Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering.

Publication, Journal Article
Chandra, NK; Canale, A; Dunson, DB
Published in: Journal of machine learning research : JMLR
April 2023

Bayesian mixture models are widely used for clustering of high-dimensional data with appropriate uncertainty quantification. However, as the dimension of the observations increases, posterior inference often tends to favor too many or too few clusters. This article explains this behavior by studying the random partition posterior in a non-standard setting with a fixed sample size and increasing data dimensionality. We provide conditions under which the finite sample posterior tends to either assign every observation to a different cluster or all observations to the same cluster as the dimension grows. Interestingly, the conditions do not depend on the choice of clustering prior, as long as all possible partitions of observations into clusters have positive prior probabilities, and hold irrespective of the true data-generating model. We then propose a class of latent mixtures for Bayesian clustering (Lamb) on a set of low-dimensional latent variables inducing a partition on the observed data. The model is amenable to scalable posterior inference and we show that it can avoid the pitfalls of high-dimensionality under mild assumptions. The proposed approach is shown to have good performance in simulation studies and an application to inferring cell types based on scRNAseq.

Published In

Journal of machine learning research : JMLR

EISSN

1533-7928

ISSN

1532-4435

Publication Date

April 2023

Volume

24

Start / End Page

144

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4905 Statistics
  • 4611 Machine learning
  • 17 Psychology and Cognitive Sciences
  • 08 Information and Computing Sciences
 

Citation

APA: Chandra, N. K., Canale, A., & Dunson, D. B. (2023). Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering. Journal of Machine Learning Research : JMLR, 24, 144.
Chicago: Chandra, Noirrit Kiran, Antonio Canale, and David B. Dunson. “Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering.” Journal of Machine Learning Research : JMLR 24 (April 2023): 144.
ICMJE / NLM: Chandra NK, Canale A, Dunson DB. Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering. Journal of machine learning research : JMLR. 2023 Apr;24:144.
MLA: Chandra, Noirrit Kiran, et al. “Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering.” Journal of Machine Learning Research : JMLR, vol. 24, Apr. 2023, p. 144.
