ProSpar-GP: Scalable Gaussian Process Modeling with Massive Nonstationary Datasets

Publication, Journal Article
Li, K; Mak, S
Published in: Journal of Computational and Graphical Statistics
January 1, 2025

Gaussian processes (GPs) are a popular class of Bayesian nonparametric models, but their training can be computationally burdensome for massive training datasets. While there has been notable work on scaling up these models for big data, existing methods typically rely on a stationary GP assumption for approximation, and can thus perform poorly when the underlying response surface is nonstationary, that is, it has some regions of rapid change and other regions with little change. Such non-stationarity is, however, ubiquitous in real-world problems, including our motivating application for surrogate modeling of computer experiments. We propose a new Product of Sparse GP (ProSpar-GP) method for scalable GP modeling with massive nonstationary data. The ProSpar-GP makes use of a carefully constructed product-of-experts formulation of sparse GP experts, where different experts are placed within local regions of non-stationarity. These GP experts are fit via a novel variational inference approach, which capitalizes on mini-batching and GPU acceleration for efficient optimization of inducing points and length-scale parameters for each expert. We further show that the ProSpar-GP is Kolmogorov-consistent, in that its generative distribution defines a valid stochastic process over the prediction space; such a property provides essential stability for variational inference, particularly in the presence of non-stationarity. We then demonstrate the improved performance of the ProSpar-GP over the state-of-the-art, in a suite of numerical experiments and an application for surrogate modeling of a satellite drag simulator. Supplemental materials for this article are available online.

Published In

Journal of Computational and Graphical Statistics

DOI

10.1080/10618600.2025.2490264

EISSN

1537-2715

ISSN

1061-8600

Publication Date

January 1, 2025

Volume

34

Issue

4

Start / End Page

1742 / 1759

Related Subject Headings

  • Statistics & Probability
  • 4905 Statistics
  • 1403 Econometrics
  • 0104 Statistics
 

Citation

APA
Li, K., & Mak, S. (2025). ProSpar-GP: Scalable Gaussian Process Modeling with Massive Nonstationary Datasets. Journal of Computational and Graphical Statistics, 34(4), 1742–1759. https://doi.org/10.1080/10618600.2025.2490264

Chicago
Li, K., and S. Mak. “ProSpar-GP: Scalable Gaussian Process Modeling with Massive Nonstationary Datasets.” Journal of Computational and Graphical Statistics 34, no. 4 (January 1, 2025): 1742–59. https://doi.org/10.1080/10618600.2025.2490264.

ICMJE
Li K, Mak S. ProSpar-GP: Scalable Gaussian Process Modeling with Massive Nonstationary Datasets. Journal of Computational and Graphical Statistics. 2025 Jan 1;34(4):1742–59.

MLA
Li, K., and S. Mak. “ProSpar-GP: Scalable Gaussian Process Modeling with Massive Nonstationary Datasets.” Journal of Computational and Graphical Statistics, vol. 34, no. 4, Jan. 2025, pp. 1742–59. Scopus, doi:10.1080/10618600.2025.2490264.

NLM
Li K, Mak S. ProSpar-GP: Scalable Gaussian Process Modeling with Massive Nonstationary Datasets. Journal of Computational and Graphical Statistics. 2025 Jan 1;34(4):1742–1759.
