Nonparametric Bayes Modeling of Multivariate Categorical Data.

Published

Journal Article

Modeling of multivariate unordered categorical (nominal) data is a challenging problem, particularly in high dimensions and cases in which one wishes to avoid strong assumptions about the dependence structure. Commonly used approaches rely on the incorporation of latent Gaussian random variables or parametric latent class models. The goal of this article is to develop a nonparametric Bayes approach, which defines a prior with full support on the space of distributions for multiple unordered categorical variables. This support condition ensures that we are not restricting the dependence structure a priori. We show this can be accomplished through a Dirichlet process mixture of product multinomial distributions, which is also a convenient form for posterior computation. Methods for nonparametric testing of violations of independence are proposed, and the methods are applied to model positional dependence within transcription factor binding motifs.
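As a rough illustration of the model class named in the abstract, the sketch below simulates data from a Dirichlet process mixture of product multinomials using a truncated stick-breaking construction. Within each mixture component the categorical variables are independent, and dependence arises only from mixing over components. The function name, the truncation level, the concentration parameter `alpha`, and the uniform Dirichlet priors on the category probabilities are all assumptions made for this example; it is not the authors' implementation, and it covers only the generative side, not posterior computation or the independence tests.

```python
import numpy as np

def sample_dp_product_multinomial(n, levels, alpha=1.0, truncation=20, seed=0):
    """Simulate n rows from a truncated DP mixture of product multinomials.

    levels     : list of ints, number of categories for each variable
    alpha      : DP concentration parameter (assumed value, not from the article)
    truncation : number of stick-breaking components kept (an approximation)
    """
    rng = np.random.default_rng(seed)

    # Stick-breaking weights: V_h ~ Beta(1, alpha); the final stick is set
    # to 1 so that the truncated weights sum to one.
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0
    weights = v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

    # One probability vector per (component, variable) pair, drawn from a
    # uniform Dirichlet prior (an illustrative choice).
    probs = [rng.dirichlet(np.ones(d), size=truncation) for d in levels]

    # Latent component label for each observation.
    z = rng.choice(truncation, size=n, p=weights)

    # Given its label, each variable is drawn independently of the others.
    x = np.empty((n, len(levels)), dtype=int)
    for j, d in enumerate(levels):
        for i in range(n):
            x[i, j] = rng.choice(d, p=probs[j][z[i]])
    return x, z, weights

# Hypothetical usage: 100 sequences of 6 positions with 4 categories each,
# loosely mimicking nucleotide positions in a binding motif.
x, z, weights = sample_dp_product_multinomial(100, [4] * 6, alpha=1.0)
```

Each row of `x` holds category codes starting at 0, and `z` records which component generated the row. Marginalizing over `z` gives a joint distribution in which the positions are generally dependent, even though they are independent within each component.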

Cited Authors

  • Dunson, DB; Xing, C

Published Date

  • January 2012

Published In

  • Journal of the American Statistical Association

Volume / Issue

  • 104 / 487

Start / End Page

  • 1042 - 1051

PubMed ID

  • 23606777

PubMed Central ID

  • 23606777

Electronic International Standard Serial Number (EISSN)

  • 1537-274X

International Standard Serial Number (ISSN)

  • 0162-1459

Digital Object Identifier (DOI)

  • 10.1198/jasa.2009.tm08439

Language

  • eng