Topic Modeling with Nonparametric Markov Tree.

Published

Journal Article

A new hierarchical tree-based topic model is developed, based on nonparametric Bayesian techniques. The model has two unique attributes: (i) a child node in the tree may have more than one parent, with the goal of eliminating redundant sub-topics deep in the tree; and (ii) parsimonious sub-topics are manifested, by removing redundant usage of words at multiple scales. The depth and width of the tree are unbounded within the prior, with a retrospective sampler employed to adaptively infer the appropriate tree size based upon the corpus under study. Excellent quantitative results are manifested on five standard data sets, and the inferred tree structure is also found to be highly interpretable.

Cited Authors

  • Chen, H; Dunson, DB; Carin, L

Published Date

  • January 2011

Published In

  • Proceedings of the ... International Conference on Machine Learning. International Conference on Machine Learning

Volume / Issue

  • 2011 /

Start / End Page

  • 377 - 384

PubMed ID

  • 25279387

PubMed Central ID

  • 25279387

Language

  • eng