Uniform-in-time weak error analysis for stochastic gradient descent algorithms via diffusion approximation

Publication, Journal Article
Feng, Y; Gao, T; Li, L; Liu, JG; Lu, Y
Published in: Communications in Mathematical Sciences
January 1, 2020

Diffusion approximation provides a weak approximation for stochastic gradient descent (SGD) algorithms over a finite time horizon. In this paper, we introduce new tools motivated by the backward error analysis of numerical stochastic differential equations into the theoretical framework of diffusion approximation, extending the validity of the weak approximation from a finite to an infinite time horizon. The new techniques developed in this paper enable us to characterize the asymptotic behavior of constant-step-size SGD algorithms near a local minimum around which the objective functions are locally strongly convex, a goal previously unreachable within the diffusion approximation framework. Our analysis builds upon a truncated formal power expansion of the solution of a Kolmogorov equation arising from diffusion approximation, where the main technical ingredient is uniform-in-time bounds controlling the long-term behavior of the expansion coefficient functions near the local minimum. We expect these new techniques to bring new understanding of the behavior of SGD near local minima and to greatly expand the range of applicability of diffusion approximation to cover wider and deeper aspects of stochastic optimization algorithms in data science.
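
The uniform-in-time idea can be illustrated on a toy problem. The sketch below is not the paper's construction; it assumes a one-dimensional quadratic objective f(x) = x^2 / 2 with additive Gaussian gradient noise of standard deviation sigma, so that constant-step SGD reads x_{k+1} = x_k - eta (x_k + sigma xi_k), and it assumes the standard first-order diffusion approximation dX = -X dt + sqrt(eta) sigma dW for that recursion. For the test function x^2, both second moments are available in closed form, so the weak error can be tracked at every step without sampling. All function names are hypothetical.

```python
import math


def sgd_second_moment(eta, sigma, x0, n_steps):
    """Exact E[x_k^2], k = 0..n_steps, for the toy SGD recursion
    x_{k+1} = x_k - eta * (x_k + sigma * xi_k), with xi_k ~ N(0, 1) i.i.d.
    (assumed model: quadratic objective, additive gradient noise)."""
    m = x0 ** 2
    moments = [m]
    for _ in range(n_steps):
        # xi_k is independent of x_k and has mean zero, so cross terms vanish.
        m = (1.0 - eta) ** 2 * m + (eta * sigma) ** 2
        moments.append(m)
    return moments


def sde_second_moment(eta, sigma, x0, t):
    """E[X_t^2] for the Ornstein-Uhlenbeck process
    dX = -X dt + sqrt(eta) * sigma dW, X_0 = x0
    (assumed first-order diffusion approximation of the recursion above)."""
    decay = math.exp(-2.0 * t)
    return x0 ** 2 * decay + 0.5 * eta * sigma ** 2 * (1.0 - decay)


def weak_error_gaps(eta, sigma=1.0, x0=1.0, n_steps=10_000):
    """|E[x_k^2] - E[X_{k*eta}^2]| at every step k, matching SGD step k
    with SDE time t = k * eta."""
    sgd = sgd_second_moment(eta, sigma, x0, n_steps)
    return [
        abs(m - sde_second_moment(eta, sigma, x0, k * eta))
        for k, m in enumerate(sgd)
    ]


if __name__ == "__main__":
    for eta in (0.1, 0.05, 0.025):
        gaps = weak_error_gaps(eta)
        print(f"eta={eta}: max gap={max(gaps):.3e}, final gap={gaps[-1]:.3e}")
```

In this toy model the gap does not grow with the number of steps: the transient part decays, and the long-time gap settles at the difference between the two stationary second moments, eta sigma^2 / (2 - eta) for SGD and eta sigma^2 / 2 for the diffusion. Halving eta should shrink the maximum gap over all steps, which is the kind of step-size-controlled, time-independent bound the abstract describes for the general locally strongly convex case.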


Published In

Communications in Mathematical Sciences

DOI

10.4310/CMS.2020.v18.n1.a7

EISSN

1945-0796

ISSN

1539-6746

Publication Date

January 1, 2020

Volume

18

Issue

1

Start / End Page

163 / 188

Related Subject Headings

  • Applied Mathematics
  • 4904 Pure mathematics
  • 4901 Applied mathematics
  • 1502 Banking, Finance and Investment
  • 0102 Applied Mathematics
  • 0101 Pure Mathematics
 

Citation

APA
Feng, Y., Gao, T., Li, L., Liu, J. G., & Lu, Y. (2020). Uniform-in-time weak error analysis for stochastic gradient descent algorithms via diffusion approximation. Communications in Mathematical Sciences, 18(1), 163–188. https://doi.org/10.4310/CMS.2020.v18.n1.a7

Chicago
Feng, Y., T. Gao, L. Li, J. G. Liu, and Y. Lu. “Uniform-in-time weak error analysis for stochastic gradient descent algorithms via diffusion approximation.” Communications in Mathematical Sciences 18, no. 1 (January 1, 2020): 163–88. https://doi.org/10.4310/CMS.2020.v18.n1.a7.

ICMJE
Feng Y, Gao T, Li L, Liu JG, Lu Y. Uniform-in-time weak error analysis for stochastic gradient descent algorithms via diffusion approximation. Communications in Mathematical Sciences. 2020 Jan 1;18(1):163–88.

MLA
Feng, Y., et al. “Uniform-in-time weak error analysis for stochastic gradient descent algorithms via diffusion approximation.” Communications in Mathematical Sciences, vol. 18, no. 1, Jan. 2020, pp. 163–88. Scopus, doi:10.4310/CMS.2020.v18.n1.a7.

NLM
Feng Y, Gao T, Li L, Liu JG, Lu Y. Uniform-in-time weak error analysis for stochastic gradient descent algorithms via diffusion approximation. Communications in Mathematical Sciences. 2020 Jan 1;18(1):163–188.
