Scholars@Duke publication: Efficient Gaussian process regression for large datasets

Publication, Journal Article

Banerjee, A; Dunson, DB; Tokdar, ST

Published in: Biometrika

2013

Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3 where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples. © 2012 Biometrika Trust.
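
The contrast drawn in the abstract, between the order-n^3 exact computation and a linear projection of all n data points onto m << n dimensions, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes one common way of turning a projection matrix Phi (m x n) into a low-rank covariance, K_hat = (Phi K)^T (Phi K Phi^T)^{-1} (Phi K), and every name in it (sq_exp_kernel, full_gp_mean, projected_gp_mean, the kernel choice, the jitter value, the toy data) is hypothetical and chosen only for illustration.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=0.2):
    """Squared-exponential covariance between 1-D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)


def full_gp_mean(K, y, noise_var):
    """Exact posterior mean at the training inputs; the n x n solve is O(n^3)."""
    n = K.shape[0]
    return K @ np.linalg.solve(K + noise_var * np.eye(n), y)


def projected_gp_mean(K, y, Phi, noise_var, jitter=1e-8):
    """Posterior mean with K replaced by (Phi K)^T (Phi K Phi^T)^{-1} (Phi K).

    Phi is m x n with m << n, so only m x m systems are factorised or solved.
    (Assumed construction for illustration, not taken from the paper.)
    """
    m = Phi.shape[0]
    B = Phi @ K                                   # m x n
    C = B @ Phi.T                                 # m x m
    C = C + jitter * np.trace(C) / m * np.eye(m)  # small ridge for stability
    chol = np.linalg.cholesky(C)                  # C = chol @ chol.T
    L = np.linalg.solve(chol, B).T                # n x m, with K_hat = L @ L.T
    inner = L.T @ L + noise_var * np.eye(m)       # m x m
    return L @ np.linalg.solve(inner, L.T @ y)


# Toy data (hypothetical): noisy sine curve on [0, 1].
rng = np.random.default_rng(0)
n, m, noise_var = 200, 30, 0.01
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + np.sqrt(noise_var) * rng.standard_normal(n)
K = sq_exp_kernel(x, x)

Phi = rng.standard_normal((m, n))  # Gaussian random projection of all n points
mean_full = full_gp_mean(K, y, noise_var)
mean_proj = projected_gp_mean(K, y, Phi, noise_var)
print("max abs difference:", np.max(np.abs(mean_full - mean_proj)))
```

Under the same assumed construction, choosing Phi as m rows of the identity matrix gives a subset-based low-rank approximation, which is one way to see how a projection generalises subset selection. The sketch still forms the full n x n matrix K and the product Phi @ K, so it illustrates the algebra rather than a memory-efficient implementation.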

Duke Scholars

Author David B. Dunson Statistical Science

Author Surya Tapas Tokdar Statistical Science

Published In

Biometrika

DOI

10.1093/biomet/ass068

ISSN

0006-3444

Publication Date

2013

Volume

100

Issue

1

Start / End Page

75 / 89

Related Subject Headings

Statistics & Probability
4905 Statistics
3802 Econometrics
1403 Econometrics
0104 Statistics
0103 Numerical and Computational Mathematics

Citation

APA: Banerjee, A., Dunson, D. B., & Tokdar, S. T. (2013). Efficient Gaussian process regression for large datasets. Biometrika, 100(1), 75–89. https://doi.org/10.1093/biomet/ass068

Chicago: Banerjee, A., D. B. Dunson, and S. T. Tokdar. “Efficient Gaussian process regression for large datasets.” Biometrika 100, no. 1 (2013): 75–89. https://doi.org/10.1093/biomet/ass068.

ICMJE: Banerjee A, Dunson DB, Tokdar ST. Efficient Gaussian process regression for large datasets. Biometrika. 2013;100(1):75–89.

MLA: Banerjee, A., et al. “Efficient Gaussian process regression for large datasets.” Biometrika, vol. 100, no. 1, 2013, pp. 75–89. Scival, doi:10.1093/biomet/ass068.

NLM: Banerjee A, Dunson DB, Tokdar ST. Efficient Gaussian process regression for large datasets. Biometrika. 2013;100(1):75–89.
