Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning

Publication, Journal Article

Guo, P; Ye, Z; Xiao, K; Zhu, W

Published in: IEEE Transactions on Knowledge and Data Engineering

October 1, 2022

This paper investigates stochastic optimization with a focus on scalable parallel algorithms for deep learning. Our solution reformulates the objective function for stochastic optimization in neural network models and introduces a novel parallel computing strategy, weighted aggregating stochastic gradient descent (WASGD). Building on a theoretical analysis of the new objective function, WASGD uses a decentralized weighted aggregating scheme based on the performance of local workers. Without any center variable, the method automatically gauges the importance of each local worker and incorporates it according to its contribution. We also develop an enhanced version, WASGD+, by (1) using a designed sample order and (2) upgrading the weight evaluation function. We benchmark the method against several popular algorithms, including state-of-the-art training techniques for deep neural network classifiers (e.g., elastic averaging SGD), on four classic datasets: CIFAR-100, CIFAR-10, Fashion-MNIST, and MNIST. The results show that WASGD accelerates the training of deep architectures relative to these baselines, and that WASGD+ improves significantly on the original WASGD.
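The idea of aggregating local workers by performance, with no center variable, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give the weight evaluation function, so the softmax over negative losses used here, the `temperature` parameter, and the function names `aggregation_weights` and `weighted_aggregate` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def aggregation_weights(losses: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Map per-worker losses to weights that sum to 1 (lower loss, larger weight).

    Assumption: a softmax over negative losses. The weight evaluation
    function actually used by WASGD may differ.
    """
    if not losses:
        raise ValueError("at least one worker loss is required")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    best = min(losses)  # shift by the minimum loss for numerical stability
    scores = [math.exp(-(loss - best) / temperature) for loss in losses]
    total = sum(scores)
    return [score / total for score in scores]


def weighted_aggregate(
    params: Sequence[Sequence[float]],
    losses: Sequence[float],
    temperature: float = 1.0,
) -> List[float]:
    """Combine worker parameter vectors into one vector using loss-based weights."""
    if len(params) != len(losses):
        raise ValueError("params and losses must have one entry per worker")
    weights = aggregation_weights(losses, temperature)
    dim = len(params[0])
    if any(len(p) != dim for p in params):
        raise ValueError("all parameter vectors must have the same length")
    # Coordinate-wise weighted sum across workers.
    return [sum(w * p[i] for w, p in zip(weights, params)) for i in range(dim)]


if __name__ == "__main__":
    # Three hypothetical workers, each holding a 2-dimensional parameter vector.
    worker_params = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    worker_losses = [0.2, 0.5, 1.0]
    print(aggregation_weights(worker_losses))
    print(weighted_aggregate(worker_params, worker_losses))
```

In a decentralized setting, each worker would run this aggregation on the parameters and losses it receives from its peers, so no central parameter server is needed. With equal losses the sketch reduces to plain parameter averaging.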

Duke Scholars

Author Pengzhan Guo DKU Faculty

Published In

IEEE Transactions on Knowledge and Data Engineering

DOI

10.1109/TKDE.2020.3047894

EISSN

1558-2191

ISSN

1041-4347

Publication Date

October 1, 2022

Volume

34

Issue

10

Start / End Page

5037 / 5050

Related Subject Headings

Information Systems
46 Information and computing sciences
08 Information and Computing Sciences

Citation

APA

Guo, P., Ye, Z., Xiao, K., & Zhu, W. (2022). Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning. IEEE Transactions on Knowledge and Data Engineering, 34(10), 5037–5050. https://doi.org/10.1109/TKDE.2020.3047894

Chicago

Guo, P., Z. Ye, K. Xiao, and W. Zhu. “Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning.” IEEE Transactions on Knowledge and Data Engineering 34, no. 10 (October 1, 2022): 5037–50. https://doi.org/10.1109/TKDE.2020.3047894.

ICMJE

Guo P, Ye Z, Xiao K, Zhu W. Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning. IEEE Transactions on Knowledge and Data Engineering. 2022 Oct 1;34(10):5037–50.

MLA

Guo, P., et al. “Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning.” IEEE Transactions on Knowledge and Data Engineering, vol. 34, no. 10, Oct. 2022, pp. 5037–50. Scopus, doi:10.1109/TKDE.2020.3047894.

NLM

Guo P, Ye Z, Xiao K, Zhu W. Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning. IEEE Transactions on Knowledge and Data Engineering. 2022 Oct 1;34(10):5037–5050.
