Scholars@Duke publication: Learning mixtures of gaussians in high dimensions

Learning mixtures of gaussians in high dimensions

Publication , Conference

Ge, R; Huang, Q; Kakade, SM

Published in: Proceedings of the Annual ACM Symposium on Theory of Computing

June 14, 2015

Efficiently learning mixture of Gaussians is a fundamental problem in statistics and learning theory. Given samples coming from a random one out of k Gaussian distributions in Rn, the learning problem asks to estimate the means and the covariance matrices of these Gaussians. This learning problem arises in many areas ranging from the natural sciences to the social sciences, and has also found many machine learning applications. Unfortunately, learning mixture of Gaussians is an information theoretically hard problem: in order to learn the parameters up to a reasonable accuracy, the number of samples required is exponential in the number of Gaussian components in the worst case. In this work, we show that provided we are in high enough dimensions, the class of Gaussian mixtures is learnable in its most general form under a smoothed analysis framework, where the parameters are randomly perturbed from an adversarial starting point. In particular, given samples from a mixture of Gaussians with randomly perturbed parameters, when n ≥ ?(k2), we give an algorithm that learns the parameters with polynomial running time and using polynomial number of samples. The central algorithmic ideas consist of new ways to decompose the moment tensor of the Gaussian mixture by exploiting its structural properties. The symmetries of this tensor are derived from the combinatorial structure of higher order moments of Gaussian distributions (sometimes referred to as Isserlis' theorem or Wick's theorem). We also develop new tools for bounding smallest singular values of structured random matrices, which could be useful in other smoothed analysis settings.

Duke Scholars

Author Rong Ge Computer Science

Published In

Proceedings of the Annual ACM Symposium on Theory of Computing

DOI

10.1145/2746539.2746616

ISSN

0737-8017

ISBN

9781450335362

Publication Date

June 14, 2015

Volume

14-17-June-2015

Start / End Page

761 / 770

Citation

APA

Chicago

ICMJE

MLA

NLM

Ge, R., Huang, Q., & Kakade, S. M. (2015). Learning mixtures of gaussians in high dimensions. In Proceedings of the Annual ACM Symposium on Theory of Computing (Vol. 14-17-June-2015, pp. 761–770). https://doi.org/10.1145/2746539.2746616

Ge, R., Q. Huang, and S. M. Kakade. “Learning mixtures of gaussians in high dimensions.” In Proceedings of the Annual ACM Symposium on Theory of Computing, 14-17-June-2015:761–70, 2015. https://doi.org/10.1145/2746539.2746616.

Ge R, Huang Q, Kakade SM. Learning mixtures of gaussians in high dimensions. In: Proceedings of the Annual ACM Symposium on Theory of Computing. 2015. p. 761–70.

Ge, R., et al. “Learning mixtures of gaussians in high dimensions.” Proceedings of the Annual ACM Symposium on Theory of Computing, vol. 14-17-June-2015, 2015, pp. 761–70. Scopus, doi:10.1145/2746539.2746616.

Ge R, Huang Q, Kakade SM. Learning mixtures of gaussians in high dimensions. Proceedings of the Annual ACM Symposium on Theory of Computing. 2015. p. 761–770.

Published In

Proceedings of the Annual ACM Symposium on Theory of Computing

DOI

10.1145/2746539.2746616

ISSN

0737-8017

ISBN

9781450335362

Publication Date

June 14, 2015

Volume

14-17-June-2015

Start / End Page

761 / 770