
Stochastic nested variance reduction for nonconvex optimization

Publication: Journal Article
Zhou, D; Xu, P; Gu, Q
Published in: Journal of Machine Learning Research
May 1, 2020

We study nonconvex optimization problems, where the objective function is either an average of n nonconvex functions or the expectation of some stochastic function. We propose a new stochastic gradient descent algorithm based on nested variance reduction, namely, Stochastic Nested Variance-Reduced Gradient descent (SNVRG). Compared with the conventional stochastic variance reduced gradient (SVRG) algorithm, which uses two reference points to construct a semi-stochastic gradient with diminishing variance in each iteration, our algorithm uses K + 1 nested reference points to build a semi-stochastic gradient whose variance is further reduced in each iteration. For smooth nonconvex functions, SNVRG converges to an ε-approximate first-order stationary point within Õ(n ∧ ε^{-2} + ε^{-3} ∧ n^{1/2}ε^{-2}) stochastic gradient evaluations. This improves the best known gradient complexity of SVRG, O(n + n^{2/3}ε^{-2}), and that of SCSG, O(n ∧ ε^{-2} + ε^{-10/3} ∧ n^{2/3}ε^{-2}). For gradient dominated functions, SNVRG also achieves better gradient complexity than the state-of-the-art algorithms. Based on SNVRG, we further propose two algorithms that can find local minima faster than state-of-the-art algorithms in both finite-sum and general stochastic (online) nonconvex optimization. In particular, for finite-sum optimization problems, the proposed SNVRG + Neon2^finite algorithm achieves Õ(n^{1/2}ε^{-2} + n ε_H^{-3} + n^{3/4}ε_H^{-7/2}) gradient complexity to converge to an (ε, ε_H)-second-order stationary point, which outperforms SVRG + Neon2^finite (Allen-Zhu and Li, 2018), the best existing algorithm, in a wide regime. For general stochastic optimization problems, the proposed SNVRG + Neon2^online algorithm achieves Õ(ε^{-3} + ε_H^{-5} + ε^{-2}ε_H^{-3}) gradient complexity, which is better than both SVRG + Neon2^online (Allen-Zhu and Li, 2018) and Natasha2 (Allen-Zhu, 2018a) in certain regimes. Thorough experimental results on different nonconvex optimization problems back up our theory.
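To make the contrast concrete, the following minimal Python sketch (not the authors' reference implementation; grad_fn, refs, and batches are assumed, illustrative names) shows the classic SVRG estimator with its two reference points next to a nested estimator built from K + 1 reference points, where each nesting level corrects the estimate using a smaller minibatch:

def svrg_gradient(grad_fn, x, snapshot, full_grad_at_snapshot, idx):
    """Classic SVRG estimator: two reference points (current iterate and snapshot).

    grad_fn(point, idx) is assumed to return the average gradient of the
    component functions indexed by idx at the given point.
    """
    return grad_fn(x, idx) - grad_fn(snapshot, idx) + full_grad_at_snapshot

def nested_gradient(grad_fn, refs, batches):
    """Nested estimator with K + 1 reference points refs[0], ..., refs[K].

    refs[0] is the outermost (least frequently updated) reference point and
    refs[K] is the current iterate; batches[l] indexes the minibatch used at
    nesting level l, with batches[0] the largest and batches[K] the smallest.
    """
    g = grad_fn(refs[0], batches[0])  # base estimate at the outer reference point
    for l in range(1, len(refs)):
        # each level corrects the estimate with a smaller batch; since adjacent
        # reference points are close, the variance of each correction is small
        g += grad_fn(refs[l], batches[l]) - grad_fn(refs[l - 1], batches[l])
    return g

In SNVRG the reference points are refreshed at different frequencies and the batch sizes shrink from the outer to the inner level, which is what drives the improved gradient complexity; the sketch above only illustrates the structure of the nested estimator, not the update schedule.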


Published In

Journal of Machine Learning Research

EISSN

1533-7928

ISSN

1532-4435

Publication Date

May 1, 2020

Volume

21

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4905 Statistics
  • 4611 Machine learning
  • 17 Psychology and Cognitive Sciences
  • 08 Information and Computing Sciences
 

Citation

APA: Zhou, D., Xu, P., & Gu, Q. (2020). Stochastic nested variance reduction for nonconvex optimization. Journal of Machine Learning Research, 21.
Chicago: Zhou, D., P. Xu, and Q. Gu. “Stochastic nested variance reduction for nonconvex optimization.” Journal of Machine Learning Research 21 (May 1, 2020).
MLA: Zhou, D., et al. “Stochastic nested variance reduction for nonconvex optimization.” Journal of Machine Learning Research, vol. 21, May 2020.
ICMJE/NLM: Zhou D, Xu P, Gu Q. Stochastic nested variance reduction for nonconvex optimization. Journal of Machine Learning Research. 2020 May 1;21.
