
On the local minima of the empirical risk

Publication, Conference

Jin, C; Ge, R; Liu, LT; Jordan, MI

Published in: Advances in Neural Information Processing Systems

January 1, 2018

Population risk is always of primary interest in machine learning; however, learning algorithms only have access to the empirical risk. Even for applications with nonconvex nonsmooth losses (such as modern deep networks), the population risk is generally significantly more well-behaved from an optimization point of view than the empirical risk. In particular, sampling can create many spurious local minima. We consider a general framework which aims to optimize a smooth nonconvex function F (population risk) given only access to an approximation f (empirical risk) that is pointwise close to F (i.e., ‖F − f‖∞ ≤ ν). Our objective is to find the ε-approximate local minima of the underlying function F while avoiding the shallow local minima, arising because of the tolerance ν, which exist only in f. We propose a simple algorithm based on stochastic gradient descent (SGD) on a smoothed version of f that is guaranteed to achieve our goal as long as ν ≤ O(ε^1.5/d). We also provide an almost matching lower bound showing that our algorithm achieves optimal error tolerance ν among all algorithms making a polynomial number of queries of f. As a concrete example, we show that our results can be directly used to give sample complexities for learning a ReLU unit.
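The idea of running SGD on a smoothed version of f can be illustrated with a minimal sketch. The abstract does not specify the smoothing or the gradient estimator, so everything below is an assumption made for illustration: Gaussian smoothing with radius sigma, a gradient estimate built only from function values of f, and hypothetical toy functions F and f with ‖F − f‖∞ ≤ ν. It is not the authors' algorithm or their parameter choices.

```python
import numpy as np


def smoothed_grad_estimate(f, x, sigma, batch, rng):
    """Estimate the gradient of the Gaussian-smoothed f at x from function values only.

    Uses the identity grad E[f(x + z)] = E[z * (f(x + z) - f(x))] / sigma^2
    for z ~ N(0, sigma^2 I). The choice of Gaussian smoothing is an assumption.
    """
    d = x.shape[0]
    z = rng.normal(0.0, sigma, size=(batch, d))
    fx = f(x)
    diffs = np.array([f(x + zi) - fx for zi in z])
    return (z * diffs[:, None]).mean(axis=0) / sigma**2


def smoothed_sgd(f, x0, sigma=0.3, lr=0.05, steps=500, batch=32, seed=0):
    """Plain SGD driven by the smoothed-gradient estimate (hypothetical defaults)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x -= lr * smoothed_grad_estimate(f, x, sigma, batch, rng)
    return x


def F(x):
    """Hypothetical smooth 'population risk' with its minimum at the origin."""
    return 0.5 * float(np.dot(x, x))


def f(x, nu=0.01):
    """Hypothetical 'empirical risk': F plus a fast oscillation bounded by nu.

    The oscillation creates many shallow local minima that exist only in f.
    """
    return F(x) + nu * float(np.sin(50.0 * x).sum()) / x.shape[0]


# Only f is queried; F is used afterwards to inspect the result.
x_hat = smoothed_sgd(f, np.array([1.0, -1.0]))
print("F at returned point:", F(x_hat))
```

In this toy setting the smoothing radius is much larger than the period of the oscillation, so the averaged gradient is dominated by F and the iterates are not trapped by the shallow minima of f.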

Duke Scholars

Author Rong Ge Computer Science

Published In

Advances in Neural Information Processing Systems

ISSN

1049-5258

Publication Date

January 1, 2018

Volume

2018-December

Start / End Page

4896 / 4905

Related Subject Headings

4611 Machine learning
1702 Cognitive Sciences
1701 Psychology

Citation

APA

Jin, C., Ge, R., Liu, L. T., & Jordan, M. I. (2018). On the local minima of the empirical risk. In Advances in Neural Information Processing Systems (Vol. 2018-December, pp. 4896–4905).

Chicago

Jin, C., R. Ge, L. T. Liu, and M. I. Jordan. “On the local minima of the empirical risk.” In Advances in Neural Information Processing Systems, 2018-December:4896–4905, 2018.

ICMJE

Jin C, Ge R, Liu LT, Jordan MI. On the local minima of the empirical risk. In: Advances in Neural Information Processing Systems. 2018. p. 4896–905.

MLA

Jin, C., et al. “On the local minima of the empirical risk.” Advances in Neural Information Processing Systems, vol. 2018-December, 2018, pp. 4896–905.

NLM

Jin C, Ge R, Liu LT, Jordan MI. On the local minima of the empirical risk. Advances in Neural Information Processing Systems. 2018. p. 4896–4905.
