Latent Gaussian models for topic modeling

Conference Paper

A new approach is proposed for topic modeling, in which the latent matrix factorization employs Gaussian priors rather than the Dirichlet-class priors widely used in such models. The use of a latent-Gaussian model permits simple and efficient approximate Bayesian posterior inference via the Laplace approximation. On multiple datasets, the proposed approach is demonstrated to yield results as accurate as state-of-the-art approaches based on Dirichlet constructions, at a small fraction of the computational cost. The framework is general enough to jointly model text and binary data, demonstrated here to produce accurate and fast results for joint analysis of voting rolls and the associated legislative text. Further, it is demonstrated how the technique may be scaled up to massive data, with encouraging performance relative to alternative methods.
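As a rough illustration of the Laplace approximation mentioned in the abstract, the sketch below fits a Gaussian to the posterior of a single latent variable. It is not the paper's model: the scalar setup (a Gaussian prior on x with a Poisson count likelihood of rate exp(x)), the function name laplace_poisson_gaussian, and all parameter names are assumptions chosen only to show the two steps, namely finding the posterior mode by Newton's method and taking the variance from the curvature at that mode.

```python
import math


def laplace_poisson_gaussian(y, prior_mean=0.0, prior_var=1.0, iters=50, tol=1e-12):
    """Laplace approximation for a toy model (an assumption, not the paper's):
    x ~ Normal(prior_mean, prior_var),  y | x ~ Poisson(exp(x)).
    Returns the mean and variance of the Gaussian fitted at the posterior mode.
    """
    x = prior_mean
    for _ in range(iters):
        # Gradient and second derivative of the unnormalized log posterior
        #   y*x - exp(x) - (x - prior_mean)^2 / (2 * prior_var)
        grad = y - math.exp(x) - (x - prior_mean) / prior_var
        hess = -math.exp(x) - 1.0 / prior_var  # always negative: log posterior is concave
        step = grad / hess
        x -= step  # Newton update toward the mode
        if abs(step) < tol:
            break
    # Variance of the approximation is the inverse of the negative curvature at the mode.
    var = 1.0 / (math.exp(x) + 1.0 / prior_var)
    return x, var


mode, var = laplace_poisson_gaussian(y=3)
print(f"approximate posterior: mean={mode:.4f}, variance={var:.4f}")
```

Because the log posterior in this toy model is concave, Newton's method has a single mode to find; in a full matrix factorization the same idea would apply to vectors of latent variables, with the Hessian taking the place of the scalar curvature.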

Cited Authors

  • Hu, C; Ryu, E; Carlson, D; Wang, Y; Carin, L

Published Date

  • January 1, 2014

Volume / Issue

  • 33 /

Start / End Page

  • 393 - 401

Electronic International Standard Serial Number (EISSN)

  • 1533-7928

International Standard Serial Number (ISSN)

  • 1532-4435

Citation Source

  • Scopus