Scalable geometric density estimation

Conference Paper

It is standard to assume a low-dimensional structure in estimating a high-dimensional density. However, popular methods, such as probabilistic principal component analysis, scale poorly computationally. We introduce a novel empirical Bayes method that we term geometric density estimation (GEODE) and show that, under mild conditions and among all d-dimensional linear subspaces, the span of the d leading principal axes of the data maximizes the model posterior. With these axes pre-computed using fast singular value decomposition, GEODE easily scales to high-dimensional problems while providing uncertainty characterization. The model is also capable of imputing missing data and dynamically deleting redundant dimensions. Finally, we generalize GEODE by mixing it across a dyadic clustering tree. Both simulation studies and real-world data applications show superior performance of GEODE in terms of robustness and computational efficiency.
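
The pre-computation step mentioned in the abstract, finding the d leading principal axes by singular value decomposition, can be illustrated with a minimal sketch. This is not the GEODE implementation: the function name `leading_principal_axes`, the synthetic data, and the use of NumPy's exact thin SVD (rather than a fast or randomized SVD) are all assumptions made for illustration only.

```python
import numpy as np


def leading_principal_axes(X, d):
    """Return the sample mean, the d leading principal axes, and their variances.

    Hypothetical helper for illustration; not taken from the paper.
    """
    mu = X.mean(axis=0)
    Xc = X - mu  # center the data
    # Thin SVD: rows of Vt are principal axes, ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    axes = Vt[:d].T  # shape (D, d), orthonormal columns
    variances = s[:d] ** 2 / (X.shape[0] - 1)  # sample variance along each axis
    return mu, axes, variances


# Synthetic data (assumed): a 3-dimensional signal embedded in 50 dimensions,
# plus small isotropic noise.
rng = np.random.default_rng(0)
n, D, d = 500, 50, 3
W = rng.normal(size=(D, d))
Z = rng.normal(size=(n, d))
X = Z @ W.T + 0.1 * rng.normal(size=(n, D))

mu, axes, variances = leading_principal_axes(X, d)
coords = (X - mu) @ axes  # low-dimensional coordinates of each observation
```

For large problems, the exact SVD here would typically be swapped for a truncated or randomized solver; the rest of the sketch would be unchanged, since only the leading d axes are used.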

Cited Authors

  • Wang, Y; Canale, A; Dunson, D

Published Date

  • January 1, 2016

Published In

  • Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016

Start / End Page

  • 857 - 865

Citation Source

  • Scopus