Dynamic rank factor model for text streams

Published

Conference Paper

We propose a semi-parametric and dynamic rank factor model for topic modeling, capable of (i) discovering topic prevalence over time, and (ii) learning contemporary multi-scale dependence structures, providing topic and word correlations as a byproduct. The high-dimensional and time-evolving ordinal/rank observations (such as word counts), after an arbitrary monotone transformation, are well accommodated through an underlying dynamic sparse factor model. The framework naturally admits heavy-tailed innovations, capable of inferring abrupt temporal jumps in the importance of topics. Posterior inference is performed through straightforward Gibbs sampling, based on the forward-filtering backward-sampling algorithm. Moreover, an efficient data subsampling scheme is leveraged to speed up inference on massive datasets. The modeling framework is illustrated on two real datasets: the US State of the Union Address and the JSTOR collection from Science.
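As a rough illustration of the forward-filtering backward-sampling step named in the abstract, here is a minimal sketch for a scalar local-level model (a Gaussian random-walk state observed with Gaussian noise). This is not the paper's model: the sparse factor structure, the rank likelihood and the heavy-tailed innovations are all omitted. The function name `ffbs_local_level` and the parameters `q`, `r`, `m0` and `c0` are assumptions introduced only for this example.

```python
import numpy as np

def ffbs_local_level(y, q, r, m0=0.0, c0=1.0, rng=None):
    """Draw one state path from p(x_{1:T} | y_{1:T}) for the toy model
    x_t = x_{t-1} + w_t, w_t ~ N(0, q);  y_t = x_t + v_t, v_t ~ N(0, r).
    Hypothetical helper for illustration; not the paper's sampler."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    T = y.shape[0]
    m = np.empty(T)  # filtered means E[x_t | y_{1:t}]
    c = np.empty(T)  # filtered variances Var[x_t | y_{1:t}]

    # Forward filtering: a standard Kalman recursion.
    m_prev, c_prev = m0, c0
    for t in range(T):
        pred_var = c_prev + q              # one-step-ahead state variance
        gain = pred_var / (pred_var + r)   # Kalman gain
        m[t] = m_prev + gain * (y[t] - m_prev)
        c[t] = (1.0 - gain) * pred_var
        m_prev, c_prev = m[t], c[t]

    # Backward sampling: draw x_T, then x_t | x_{t+1}, y_{1:t} for t = T-1, ..., 1.
    x = np.empty(T)
    x[-1] = rng.normal(m[-1], np.sqrt(c[-1]))
    for t in range(T - 2, -1, -1):
        pred_var = c[t] + q
        b = c[t] / pred_var                # backward smoothing gain
        mean = m[t] + b * (x[t + 1] - m[t])
        var = c[t] * (1.0 - b)
        x[t] = rng.normal(mean, np.sqrt(var))
    return x, m, c
```

Within a Gibbs sampler, a draw like this would typically alternate with updates of the remaining parameters conditional on the sampled state path.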

Cited Authors

  • Han, S; Du, L; Salazar, E; Carin, L

Published Date

  • January 1, 2014

Published In

Volume / Issue

  • 3 / January

Start / End Page

  • 2663 - 2671

International Standard Serial Number (ISSN)

  • 1049-5258

Citation Source

  • Scopus