Simplex Factor Models for Multivariate Unordered Categorical Data.

Publication, Journal Article
Bhattacharya, A; Dunson, DB
Published in: Journal of the American Statistical Association
March 2012

Gaussian latent factor models are routinely used for modeling of dependence in continuous, binary, and ordered categorical data. For unordered categorical variables, Gaussian latent factor models lead to challenging computation and complex modeling structures. As an alternative, we propose a novel class of simplex factor models. In the single-factor case, the model treats the different categorical outcomes as independent with unknown marginals. The model can characterize flexible dependence structures parsimoniously with few factors, and as factors are added, any multivariate categorical data distribution can be accurately approximated. Using a Bayesian approach for computation and inferences, a Markov chain Monte Carlo (MCMC) algorithm is proposed that scales well with increasing dimension, with the number of factors treated as unknown. We develop an efficient proposal for updating the base probability vector in hierarchical Dirichlet models. Theoretical properties are described, and we evaluate the approach through simulation examples. Applications are described for modeling dependence in nucleotide sequences and prediction from high-dimensional categorical features.
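The generative idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation or their MCMC algorithm: each observation gets a latent weight vector on the simplex (drawn here from a Dirichlet, an assumed prior), and each categorical variable is drawn from the weighted mixture of factor-specific category distributions. The function name `sample_simplex_factor`, the concentration `alpha`, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np

def sample_simplex_factor(n, lambdas, alpha, rng):
    """Draw n observations from an illustrative simplex-factor-style model.

    lambdas: list with one array per variable j, each of shape (k, d_j);
             row h is the category distribution of variable j under factor h.
    alpha:   length-k Dirichlet concentration for the latent weights
             (an assumption of this sketch, not taken from the abstract).
    """
    # Latent factors: one probability vector on the simplex per observation.
    eta = rng.dirichlet(alpha, size=n)               # shape (n, k)
    y = np.empty((n, len(lambdas)), dtype=int)
    for j, lam in enumerate(lambdas):
        # Category probabilities are a convex combination of the k rows of lam.
        probs = eta @ lam                            # shape (n, d_j)
        for i in range(n):
            y[i, j] = rng.choice(lam.shape[1], p=probs[i])
    return eta, y

rng = np.random.default_rng(0)
k, dims = 2, [4, 4, 3]                               # hypothetical sizes
lambdas = [rng.dirichlet(np.ones(d), size=k) for d in dims]
eta, y = sample_simplex_factor(500, lambdas, alpha=np.full(k, 0.5), rng=rng)
```

With a single factor (`k = 1`) the latent weight is identically 1, so the variables are drawn independently from their own marginals, which matches the single-factor case described in the abstract. With more factors, the shared latent vector induces dependence across variables.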

Published In

Journal of the American Statistical Association

DOI

10.1080/01621459.2011.646934

EISSN

1537-274X

ISSN

0162-1459

Publication Date

March 2012

Volume

107

Issue

497

Start / End Page

362 / 377

Related Subject Headings

  • Statistics & Probability
  • 4905 Statistics
  • 3802 Econometrics
  • 1603 Demography
  • 1403 Econometrics
  • 0104 Statistics
 

Citation

APA

Bhattacharya, A., & Dunson, D. B. (2012). Simplex Factor Models for Multivariate Unordered Categorical Data. Journal of the American Statistical Association, 107(497), 362–377. https://doi.org/10.1080/01621459.2011.646934

Chicago

Bhattacharya, Anirban, and David B. Dunson. “Simplex Factor Models for Multivariate Unordered Categorical Data.” Journal of the American Statistical Association 107, no. 497 (March 2012): 362–77. https://doi.org/10.1080/01621459.2011.646934.

ICMJE

Bhattacharya A, Dunson DB. Simplex Factor Models for Multivariate Unordered Categorical Data. Journal of the American Statistical Association. 2012 Mar;107(497):362–77.

MLA

Bhattacharya, Anirban, and David B. Dunson. “Simplex Factor Models for Multivariate Unordered Categorical Data.” Journal of the American Statistical Association, vol. 107, no. 497, Mar. 2012, pp. 362–77. Epmc, doi:10.1080/01621459.2011.646934.

NLM

Bhattacharya A, Dunson DB. Simplex Factor Models for Multivariate Unordered Categorical Data. Journal of the American Statistical Association. 2012 Mar;107(497):362–377.
