Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization

Publication, Journal Article
Banks, J; Moore, C; Vershynin, R; Verzelen, N; Xu, J
Published in: IEEE Transactions on Information Theory
July 1, 2018

We study the problem of detecting a structured, low-rank signal matrix corrupted with additive Gaussian noise. This includes clustering in a Gaussian mixture model, sparse PCA, and submatrix localization. Each of these problems is conjectured to exhibit a sharp information-theoretic threshold, below which the signal is too weak for any algorithm to detect. We derive upper and lower bounds on these thresholds by applying the first and second moment methods to the likelihood ratio between these 'planted models' and null models where the signal matrix is zero. For sparse PCA and submatrix localization, we determine this threshold exactly in the limit where the number of blocks is large or the signal matrix is very sparse; for the clustering problem, our bounds differ by a factor of 2 when the number of clusters is large. Moreover, our upper bounds show that for each of these problems there is a significant regime where reliable detection is information-theoretically possible but where known algorithms such as PCA fail completely, since the spectrum of the observed matrix is uninformative. This regime is analogous to the conjectured 'hard but detectable' regime for community detection in sparse graphs.
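The second moment argument mentioned in the abstract can be sketched in its simplest form. The notation below (Y, X, W, P, Q, L) is assumed for illustration and is not taken from the abstract, and the sketch takes the noise entries to be i.i.d. standard Gaussian with no symmetry constraint; the paper's own normalization, and any conditioning it applies, may differ.

```latex
% Assumed setup: observe Y = X + W, where W has i.i.d. N(0,1) entries.
% Under the null model Q the signal is X = 0; under the planted model P
% the signal X is drawn from a prior over structured low-rank matrices.
\[
L(Y) \;=\; \frac{dP}{dQ}(Y)
     \;=\; \mathbb{E}_{X}\!\left[\exp\!\left(\langle X, Y\rangle
           - \tfrac{1}{2}\,\|X\|_F^2\right)\right]
\]
% Second moment under the null, with X' an independent copy of X.
% The step uses E_Q[exp(<A, Y>)] = exp(||A||_F^2 / 2) with A = X + X'.
\[
\mathbb{E}_{Q}\!\left[L(Y)^2\right]
  \;=\; \mathbb{E}_{X,X'}\!\left[\exp\!\left(
        \tfrac{1}{2}\|X + X'\|_F^2 - \tfrac{1}{2}\|X\|_F^2
        - \tfrac{1}{2}\|X'\|_F^2\right)\right]
  \;=\; \mathbb{E}_{X,X'}\!\left[\exp\!\left(\langle X, X'\rangle\right)\right]
\]
```

If this quantity stays bounded as the dimension grows, the planted model is contiguous to the null model, so no test can drive both error probabilities to zero. This is the general mechanism by which a second moment calculation yields a lower bound on a detection threshold.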

Published In

IEEE Transactions on Information Theory

DOI

10.1109/TIT.2018.2810020

ISSN

0018-9448

Publication Date

July 1, 2018

Volume

64

Issue

7

Start / End Page

4872 / 4894

Related Subject Headings

  • Networking & Telecommunications
  • 4613 Theory of computation
  • 4006 Communications engineering
  • 1005 Communications Technologies
  • 0906 Electrical and Electronic Engineering
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA: Banks, J., Moore, C., Vershynin, R., Verzelen, N., & Xu, J. (2018). Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization. IEEE Transactions on Information Theory, 64(7), 4872–4894. https://doi.org/10.1109/TIT.2018.2810020
Chicago: Banks, J., C. Moore, R. Vershynin, N. Verzelen, and J. Xu. “Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization.” IEEE Transactions on Information Theory 64, no. 7 (July 1, 2018): 4872–94. https://doi.org/10.1109/TIT.2018.2810020.
ICMJE: Banks J, Moore C, Vershynin R, Verzelen N, Xu J. Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization. IEEE Transactions on Information Theory. 2018 Jul 1;64(7):4872–94.
MLA: Banks, J., et al. “Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization.” IEEE Transactions on Information Theory, vol. 64, no. 7, July 2018, pp. 4872–94. Scopus, doi:10.1109/TIT.2018.2810020.
NLM: Banks J, Moore C, Vershynin R, Verzelen N, Xu J. Information-Theoretic Bounds and Phase Transitions in Clustering, Sparse PCA, and Submatrix Localization. IEEE Transactions on Information Theory. 2018 Jul 1;64(7):4872–4894.
