SGD converges to global minimum in deep learning via star-convex path

Publication, Journal Article
Zhou, Y; Yang, J; Zhang, H; Liang, Y; Tarokh, V
Published in: 7th International Conference on Learning Representations, ICLR 2019
January 1, 2019

© 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved. Stochastic gradient descent (SGD) has been found to be surprisingly effective in training a variety of deep neural networks. However, there is still a lack of understanding of how and why SGD can train these complex networks towards a global minimum. In this study, we establish the convergence of SGD to a global minimum for nonconvex optimization problems that are commonly encountered in neural network training. Our argument exploits the following two important properties: 1) the training loss can achieve zero value (approximately), which has been widely observed in deep learning; 2) SGD follows a star-convex path, which is verified by various experiments in this paper. In such a context, our analysis shows that SGD, although it has long been considered a randomized algorithm, converges in an intrinsically deterministic manner to a global minimum.
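The star-convex path condition the abstract refers to can be checked empirically along SGD iterates. The sketch below is a minimal, hedged illustration on a toy interpolating least-squares problem (not the paper's deep networks), with the global minimum x* approximated by the final SGD iterate; all names and hyperparameters here are illustrative assumptions:

```python
import numpy as np

# Toy interpolating least-squares setup: zero training loss is achievable,
# mirroring property 1) from the abstract. (Illustrative sketch only.)
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 5))
x_true = rng.normal(size=5)
b = A @ x_true  # exact labels, so the global minimum attains zero loss

def loss(x, idx):
    # mini-batch loss 0.5 * mean((A_i x - b_i)^2)
    r = A[idx] @ x - b[idx]
    return 0.5 * np.mean(r ** 2)

def grad(x, idx):
    # stochastic gradient of the mini-batch loss above
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

# Run plain SGD, storing the path of iterates and the sampled batches.
x = rng.normal(size=5)
path, batches = [], []
for _ in range(500):
    idx = rng.choice(100, size=10, replace=False)
    path.append(x.copy())
    batches.append(idx)
    x -= 0.1 * grad(x, idx)
x_star = x  # proxy for the global minimum x*

# Star-convex path check along the iterates:
#   l_k(x*) >= l_k(x_k) + <x* - x_k, grad l_k(x_k)>
# where l_k is the mini-batch loss sampled at step k.
violations = sum(
    loss(x_star, idx) < loss(xk, idx) + (x_star - xk) @ grad(xk, idx) - 1e-9
    for xk, idx in zip(path, batches)
)
print(f"star-convexity violations along the path: {violations}/500")
```

On this convex toy problem the condition holds at every step by construction; the paper's empirical contribution is that the same inequality is observed to hold along SGD paths on nonconvex deep networks.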

Published In

7th International Conference on Learning Representations, ICLR 2019

Publication Date

January 1, 2019
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Zhou, Y., Yang, J., Zhang, H., Liang, Y., & Tarokh, V. (2019). SGD converges to global minimum in deep learning via star-convex path. 7th International Conference on Learning Representations, ICLR 2019.
Zhou, Y., J. Yang, H. Zhang, Y. Liang, and V. Tarokh. “SGD converges to global minimum in deep learning via star-convex path.” 7th International Conference on Learning Representations, ICLR 2019, January 1, 2019.
Zhou Y, Yang J, Zhang H, Liang Y, Tarokh V. SGD converges to global minimum in deep learning via star-convex path. 7th International Conference on Learning Representations, ICLR 2019. 2019 Jan 1;
Zhou, Y., et al. “SGD converges to global minimum in deep learning via star-convex path.” 7th International Conference on Learning Representations, ICLR 2019, Jan. 2019.
Zhou Y, Yang J, Zhang H, Liang Y, Tarokh V. SGD converges to global minimum in deep learning via star-convex path. 7th International Conference on Learning Representations, ICLR 2019. 2019 Jan 1;
