Scalable and robust Bayesian inference via the median posterior

Conference Paper

Many Bayesian learning methods for massive data benefit from working with small subsets of observations. In particular, significant progress has been made in scalable Bayesian learning via stochastic approximation. However, Bayesian learning methods in distributed computing environments are often problem- or distribution-specific and use ad hoc techniques. We propose a novel general approach to Bayesian inference that is scalable and robust to corruption in the data. Our technique is based on the idea of splitting the data into several non-overlapping subgroups, evaluating the posterior distribution given each independent subgroup, and then combining the results. Our main contribution is the proposed aggregation step which is based on finding the geometric median of subset posterior distributions. Presented theoretical and numerical results confirm the advantages of our approach.
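The aggregation step described in the abstract can be illustrated with a simplified sketch. The paper combines full subset posterior distributions (via their geometric median in a distance on probability measures); the example below instead applies the geometric median, computed with Weiszfeld's iteration, to subset posterior mean vectors only. The function name, the toy data, and the choice of summarizing each subset posterior by its mean are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def geometric_median(points, tol=1e-7, max_iter=200):
    """Weiszfeld's algorithm: find the point minimizing the sum of
    Euclidean distances to the given points (the geometric median)."""
    points = np.asarray(points, dtype=float)
    m = points.mean(axis=0)  # start at the centroid
    for _ in range(max_iter):
        d = np.linalg.norm(points - m, axis=1)
        d = np.maximum(d, tol)           # guard against division by zero
        w = 1.0 / d
        m_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(m_new - m) < tol:
            return m_new
        m = m_new
    return m

# Hypothetical posterior means from 5 non-overlapping data subgroups;
# the last subgroup is corrupted and yields a wildly wrong estimate.
subset_means = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.05], [50.0, -40.0]]
med = geometric_median(subset_means)
```

Unlike the plain average, which the corrupted subgroup would drag far from the truth, the geometric median stays close to the cluster of uncorrupted estimates, which is the robustness property the abstract refers to.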

Cited Authors

  • Minsker, S; Srivastava, S; Lin, L; Dunson, DB

Published Date

  • January 1, 2014

Published In

  • 31st International Conference on Machine Learning, ICML 2014

Volume / Issue

  • 5 /

Start / End Page

  • 3629 - 3639

International Standard Book Number 13 (ISBN-13)

  • 9781634393973

Citation Source

  • Scopus