Scholars@Duke publication: Accelerating CNN Training by Pruning Activation Gradients

Accelerating CNN Training by Pruning Activation Gradients

Publication, Conference

Ye, X; Dai, P; Luo, J; Guo, X; Qi, Y; Yang, J; Chen, Y

Published in: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

January 1, 2020

Sparsification is an efficient approach to accelerating CNN inference, but it is challenging to exploit sparsity during training because the gradients involved change dynamically. An important observation is that most activation gradients in back-propagation are very close to zero and have only a tiny impact on weight updates. Hence, we consider randomly pruning these very small gradients, according to the statistical distribution of the activation gradients, to accelerate CNN training. We also theoretically analyze the impact of the pruning algorithm on convergence. The proposed approach is evaluated on AlexNet and ResNet-{18, 34, 50, 101, 152} with the CIFAR-{10, 100} and ImageNet datasets. Experimental results show that our training approach achieves up to 5.92× speedup in the back-propagation stage with negligible accuracy loss.
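The idea of randomly pruning small activation gradients can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `small_fraction` parameter, the zero-mean Gaussian assumption used to pick the threshold, and the particular unbiased rounding rule are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np
from statistics import NormalDist


def estimate_threshold(grad, small_fraction):
    """Pick tau so that roughly `small_fraction` of entries satisfy |g| < tau.

    Assumption (not from the source): entries are zero-mean Gaussian.
    """
    # For a zero-mean Gaussian, E|g| = sigma * sqrt(2 / pi).
    sigma = np.mean(np.abs(grad)) * np.sqrt(np.pi / 2.0)
    # Solve P(|g| < tau) = 2 * Phi(tau / sigma) - 1 = small_fraction for tau.
    return sigma * NormalDist().inv_cdf((1.0 + small_fraction) / 2.0)


def stochastic_prune(grad, tau, rng):
    """Randomly prune entries with |g| < tau.

    A small entry becomes sign(g) * tau with probability |g| / tau and 0
    otherwise, so its expected value is unchanged. Entries with |g| >= tau
    pass through untouched.
    """
    small = np.abs(grad) < tau
    keep = rng.random(grad.shape) * tau < np.abs(grad)
    promoted = np.sign(grad) * tau
    return np.where(small, np.where(keep, promoted, 0.0), grad)


# Hypothetical usage on synthetic gradients.
rng = np.random.default_rng(0)
grad = rng.normal(0.0, 1e-3, size=(64, 128))
tau = estimate_threshold(grad, small_fraction=0.9)
pruned = stochastic_prune(grad, tau, rng)
zero_fraction = np.mean(pruned == 0.0)
```

Note that `zero_fraction` comes out lower than `small_fraction`, because some of the small entries are promoted to ±tau rather than zeroed. A sparse back-propagation kernel would then skip the zeroed entries; that part is outside this sketch.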

Duke Scholars

Yiran Chen (Author), Electrical and Computer Engineering

Published In

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

DOI

10.1007/978-3-030-58595-2_20

EISSN

1611-3349

ISSN

0302-9743

Publication Date

January 1, 2020

Volume

12370 LNCS

Start / End Page

322 / 338

Related Subject Headings

Artificial Intelligence & Image Processing
46 Information and computing sciences

Citation

APA

Ye, X., Dai, P., Luo, J., Guo, X., Qi, Y., Yang, J., & Chen, Y. (2020). Accelerating CNN Training by Pruning Activation Gradients. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12370 LNCS, pp. 322–338). https://doi.org/10.1007/978-3-030-58595-2_20

Chicago

Ye, X., P. Dai, J. Luo, X. Guo, Y. Qi, J. Yang, and Y. Chen. “Accelerating CNN Training by Pruning Activation Gradients.” In Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 12370 LNCS:322–38, 2020. https://doi.org/10.1007/978-3-030-58595-2_20.

ICMJE

Ye X, Dai P, Luo J, Guo X, Qi Y, Yang J, et al. Accelerating CNN Training by Pruning Activation Gradients. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2020. p. 322–38.

MLA

Ye, X., et al. “Accelerating CNN Training by Pruning Activation Gradients.” Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12370 LNCS, 2020, pp. 322–38. Scopus, doi:10.1007/978-3-030-58595-2_20.

NLM

Ye X, Dai P, Luo J, Guo X, Qi Y, Yang J, Chen Y. Accelerating CNN Training by Pruning Activation Gradients. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). 2020. p. 322–338.
