Stochastic gradient MCMC with stale gradients

Publication, Conference

Chen, C; Ding, N; Li, C; Zhang, Y; Carin, L

Published in: Advances in Neural Information Processing Systems

January 1, 2016

Stochastic gradient MCMC (SG-MCMC) has played an important role in large-scale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is becoming increasingly popular to employ distributed systems, where stochastic gradients are computed based on some outdated parameters, yielding what are termed stale gradients. While stale gradients could be directly used in SG-MCMC, their impact on convergence properties has not been well studied. In this paper we develop theory to show that while the bias and MSE of an SG-MCMC algorithm depend on the staleness of stochastic gradients, its estimation variance (relative to the expected estimate, based on a prescribed number of samples) is independent of it. In a simple Bayesian distributed system with SG-MCMC, where stale gradients are computed asynchronously by a set of workers, our theory indicates a linear speedup on the decrease of estimation variance w.r.t. the number of workers. Experiments on synthetic data and deep neural networks validate our theory, demonstrating the effectiveness and scalability of SG-MCMC with stale gradients.
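To make the notion of a stale gradient concrete, the following is a minimal sketch, not the authors' implementation. It assumes stochastic gradient Langevin dynamics (SGLD) as the SG-MCMC algorithm, a toy model (unit-variance Gaussian likelihood with a zero-mean Gaussian prior on the mean), and a fixed staleness simulated in a single process by evaluating each gradient at the parameter value from `staleness` steps earlier. The names `sgld_stale`, `posterior_mean`, `staleness`, and `prior_var` are hypothetical and chosen only for this illustration.

```python
import random
from collections import deque


def posterior_mean(data, prior_var=100.0):
    """Exact posterior mean for a N(theta, 1) likelihood and N(0, prior_var) prior."""
    return sum(data) / (len(data) + 1.0 / prior_var)


def sgld_stale(data, n_steps=5000, step=1e-3, batch=10, staleness=0,
               prior_var=100.0, seed=0):
    """Toy SGLD sampler whose gradients use parameters `staleness` steps old.

    Assumption: staleness is a fixed delay simulated in one process; a real
    distributed system would have workers computing gradients asynchronously.
    """
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    # Sliding window of recent parameters; history[0] is the oldest one kept.
    history = deque([theta], maxlen=staleness + 1)
    samples = []
    for _ in range(n_steps):
        stale_theta = history[0]
        minibatch = rng.sample(data, batch)
        # Minibatch estimate of the log-posterior gradient at the stale parameter.
        grad = (-stale_theta / prior_var
                + (n / batch) * sum(x - stale_theta for x in minibatch))
        # Langevin update: half-step along the gradient plus N(0, step) noise.
        theta = theta + 0.5 * step * grad + rng.gauss(0.0, step ** 0.5)
        history.append(theta)
        samples.append(theta)
    return samples
```

Setting `staleness=0` recovers ordinary SGLD, so running the sampler with several delays on the same data and comparing the sample averages against `posterior_mean(data)` gives a rough feel for how staleness affects the estimate in this toy setting. The step size and delay must stay small enough for the delayed update to remain stable.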

Duke Scholars

Author: Lawrence Carin, Electrical and Computer Engineering

Published In

Advances in Neural Information Processing Systems

ISSN

1049-5258

Publication Date

January 1, 2016

Start / End Page

2945 / 2953

Related Subject Headings

4611 Machine learning
1702 Cognitive Sciences
1701 Psychology

Citation

APA: Chen, C., Ding, N., Li, C., Zhang, Y., & Carin, L. (2016). Stochastic gradient MCMC with stale gradients. In Advances in Neural Information Processing Systems (pp. 2945–2953).

Chicago: Chen, C., N. Ding, C. Li, Y. Zhang, and L. Carin. “Stochastic gradient MCMC with stale gradients.” In Advances in Neural Information Processing Systems, 2945–53, 2016.

ICMJE: Chen C, Ding N, Li C, Zhang Y, Carin L. Stochastic gradient MCMC with stale gradients. In: Advances in Neural Information Processing Systems. 2016. p. 2945–53.

MLA: Chen, C., et al. “Stochastic gradient MCMC with stale gradients.” Advances in Neural Information Processing Systems, 2016, pp. 2945–53.

NLM: Chen C, Ding N, Li C, Zhang Y, Carin L. Stochastic gradient MCMC with stale gradients. Advances in Neural Information Processing Systems. 2016. p. 2945–2953.
