Journal Article

This article is motivated by the problem of studying the joint effect of different chemical exposures on human health outcomes. This is essentially a nonparametric regression problem, with interest being focused not on a black box for prediction but instead on selection of main effects and interactions. For interpretability we decompose the expected health outcome into a linear main effect, pairwise interactions and a nonlinear deviation. Our interest is in model selection for these different components, accounting for uncertainty and addressing nonidentifiability between the linear and nonparametric components of the semiparametric model. We propose a Bayesian approach to inference, placing variable selection priors on the different components, and developing a Markov chain Monte Carlo (MCMC) algorithm. A key component of our approach is the incorporation of a heredity constraint to only include interactions in the presence of main effects, effectively reducing dimensionality of the model search. We adapt a projection approach developed in the spatial statistics literature to enforce identifiability in modeling the nonparametric component using a Gaussian process. We also employ a dimension reduction strategy to sample the nonlinear random effects that aids the mixing of the MCMC algorithm. The proposed MixSelect framework is evaluated using a simulation study, and is illustrated using data from the National Health and Nutrition Examination Survey (NHANES). Code is available on GitHub.
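As a rough illustration of two ideas named in the abstract, the sketch below shows a heredity constraint on pairwise interactions and a projection that removes the linear component from a Gaussian process draw. It is not the MixSelect implementation. The function names (`allowed_interactions`, `orthogonal_projector`), the choice between strong and weak heredity, the squared-exponential kernel, and the projection onto the orthogonal complement of the column space of the design matrix are all assumptions made for illustration.

```python
import numpy as np
from itertools import combinations


def allowed_interactions(gamma, strong=True):
    """List the pairs (j, k) that may enter the model under heredity.

    gamma is a 0/1 inclusion indicator for each main effect.
    strong=True requires both main effects (an assumption here);
    strong=False requires at least one of them.
    """
    pairs = []
    for j, k in combinations(range(len(gamma)), 2):
        both = bool(gamma[j]) and bool(gamma[k])
        either = bool(gamma[j]) or bool(gamma[k])
        if (strong and both) or (not strong and either):
            pairs.append((j, k))
    return pairs


def orthogonal_projector(X):
    """Return P = I - Q Q^T, the projector onto the orthogonal
    complement of the column space of X (X assumed full column rank)."""
    Q, _ = np.linalg.qr(X)  # reduced QR: Q has orthonormal columns
    return np.eye(X.shape[0]) - Q @ Q.T


# Hypothetical data: n observations of p exposures.
rng = np.random.default_rng(0)
n, p = 50, 4
X = rng.normal(size=(n, p))

# Squared-exponential kernel on the exposures (illustrative choice).
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-0.5 * sq_dists)

# Draw g ~ N(0, K), then project it so it is orthogonal to the columns of X.
L = np.linalg.cholesky(K + 1e-6 * np.eye(n))  # small jitter for stability
g = L @ rng.normal(size=n)
P = orthogonal_projector(X)
f = P @ g  # nonlinear deviation with no linear component in X

max_inner = np.abs(X.T @ f).max()  # numerically close to zero
pairs = allowed_interactions([1, 0, 1, 1])
```

With main effects 0, 2 and 3 included, only the interactions among those three exposures remain candidates, which shrinks the model search. The projected draw `f` has zero inner product with every column of `X` up to rounding error, so it cannot absorb any of the linear main effect.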

Cited Authors

  • Ferrari, F; Dunson, DB

Published Date

  • December 2020

Published In

  • The Annals of Applied Statistics

Volume / Issue

  • 14 / 4

Start / End Page

  • 1743 - 1758

PubMed ID

  • 34630816

PubMed Central ID

  • PMC8500234

Electronic International Standard Serial Number (EISSN)

  • 1941-7330

International Standard Serial Number (ISSN)

  • 1932-6157

Digital Object Identifier (DOI)

  • 10.1214/20-aoas1363


Language

  • eng