Third-order smoothness helps: Faster stochastic optimization algorithms for finding local minima
We propose stochastic optimization algorithms that find local minima faster than existing algorithms for nonconvex optimization problems, by exploiting third-order smoothness to escape non-degenerate saddle points more efficiently. More specifically, the proposed algorithm needs only $\tilde{O}(\epsilon^{-10/3})$ stochastic gradient evaluations to converge to an approximate local minimum $\mathbf{x}$, which satisfies $\|\nabla f(\mathbf{x})\|_2 \le \epsilon$ and $\lambda_{\min}(\nabla^2 f(\mathbf{x})) \ge -\sqrt{\epsilon}$ in unconstrained stochastic optimization, where $\tilde{O}(\cdot)$ hides polylogarithmic factors and constants. This improves upon the $\tilde{O}(\epsilon^{-7/2})$ gradient complexity achieved by state-of-the-art stochastic local-minimum-finding algorithms by a factor of $\tilde{O}(\epsilon^{-1/6})$. Experiments on two nonconvex optimization problems demonstrate the effectiveness of our algorithm and corroborate our theory.
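For concreteness, the approximate-local-minimum condition stated in the abstract can be checked numerically at a given point. The sketch below is illustrative only (it is not the paper's algorithm): assuming access to the gradient and Hessian of $f$ at $\mathbf{x}$, it tests $\|\nabla f(\mathbf{x})\|_2 \le \epsilon$ and $\lambda_{\min}(\nabla^2 f(\mathbf{x})) \ge -\sqrt{\epsilon}$ with NumPy.

```python
import numpy as np

def is_approx_local_min(grad, hess, eps):
    """Check the (eps, sqrt(eps)) approximate-local-minimum condition:
    ||grad f(x)||_2 <= eps  and  lambda_min(hess f(x)) >= -sqrt(eps)."""
    grad_ok = np.linalg.norm(grad) <= eps
    curvature_ok = np.linalg.eigvalsh(hess).min() >= -np.sqrt(eps)
    return grad_ok and curvature_ok

# Hypothetical example: f(x, y) = x^2 - y^2 has a non-degenerate saddle at the origin.
x = np.zeros(2)
grad = np.array([2 * x[0], -2 * x[1]])       # gradient of f at x
hess = np.array([[2.0, 0.0], [0.0, -2.0]])   # Hessian of f at x
print(is_approx_local_min(grad, hess, eps=1e-3))  # False: lambda_min = -2 < -sqrt(eps)
```

At the saddle point the gradient condition holds but the curvature condition fails, which is exactly the situation where negative-curvature (saddle-escaping) steps are needed.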