Merlise Clyde

Professor of Statistical Science
Statistical Science
Duke Box 90251, Durham, NC 27708-0251
223E Old Chem Bldg, Box 90251, Durham, NC 27708

Selected Publications


A tutorial on Bayesian multi-model linear regression with BAS and JASP.

Journal Article Behavior research methods · December 2021 Linear regression analyses commonly involve two consecutive stages of statistical inquiry. In the first stage, a single 'best' model is defined by a specific selection of relevant predictors; in the second stage, the regression coefficients of the winning ... Full text Open Access Cite

Mixtures of g-priors in Generalized Linear Models

Journal Article Journal of the American Statistical Association · December 1, 2018 Featured Publication Mixtures of Zellner's g-priors have been studied extensively in linear models and have been shown to have numerous desirable properties for Bayesian variable selection and model averaging. Several extensions of g-priors to Generalized Linear Models (GLMs) ... Full text Open Access Link to item Cite

Age modification of ozone associations with cardiovascular disease risk in adults: a potential role for soluble P-selectin and blood pressure.

Journal Article Journal of thoracic disease · July 2018 Background: Studies have suggested that age increases susceptibility to ozone-associated mortality, but the underlying mechanisms are unclear. In a previous study, personal exposure to ozone was significantly associated with a platelet activation bi ... Full text Cite

Combined use of an electrostatic precipitator and a high-efficiency particulate air filter in building ventilation systems: Effects on cardiorespiratory health indicators in healthy adults.

Journal Article Indoor air · May 2018 High-efficiency particulate air (HEPA) filtration in combination with an electrostatic precipitator (ESP) can be a cost-effective approach to reducing indoor particulate exposure, but ESPs produce ozone. The health effect of combined ESP-HEPA filtration ha ... Full text Cite

Redefine statistical significance.

Journal Article Nature human behaviour · January 2018 Featured Publication Full text Open Access Cite

Risk prediction for ovarian cancer: epidemiologic risk factors plus confirmed genetic loci

Conference BJOG-AN INTERNATIONAL JOURNAL OF OBSTETRICS AND GYNAECOLOGY · March 1, 2017 Link to item Cite

Risk Prediction for Epithelial Ovarian Cancer in 11 United States-Based Case-Control Studies: Incorporation of Epidemiologic Risk Factors and 17 Confirmed Genetic Loci.

Journal Article Am J Epidemiol · October 15, 2016 Previously developed models for predicting absolute risk of invasive epithelial ovarian cancer have included a limited number of risk factors and have had low discriminatory power (area under the receiver operating characteristic curve (AUC) < 0.60). Becau ... Full text Open Access Link to item Cite

RISK PREDICTION FOR OVARIAN CANCER: EPIDEMIOLOGIC RISK FACTORS PLUS CONFIRMED GENETIC LOCI

Conference INTERNATIONAL JOURNAL OF GYNECOLOGICAL CANCER · October 1, 2016 Link to item Cite

127 Multimodality Word-Finding Distinctions in Pediatric Cortical Stimulation Mapping.

Journal Article Neurosurgery · August 2016 INTRODUCTION: Recently, auditory naming has become a part of cortical stimulation mapping (CSM) to provide a comprehensive language map prior to resection in epilepsy patients. Modality-specific language sites have been found using CSM in adult epilepsy pa ... Full text Link to item Cite

BAS: Bayesian Model Averaging using Bayesian Adaptive Sampling

Software · 2016 Package for Bayesian Model Averaging in linear models using stochastic or deterministic sampling without replacement from posterior distributions. Prior distributions on coefficients are from Zellner's g-prior or mixtures of g-priors corresponding to the Z ... Full text Link to item Cite

Experimental Design: Bayesian Designs

Chapter · 2015 This article provides an overview of experimental design using a Bayesian decision-theoretic framework. Scientific experimentation requires decisions about how an experiment will be conducted and analyzed. Such decisions depend on the goals and purpose of ... Full text Link to item Cite

Functional Annotation Signatures of Disease Susceptibility Loci Improve SNP Association Analysis

Journal Article BMC Genomics · 2014 Background: Genetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional anno ... Full text Open Access Cite

Finite population estimators in stochastic search variable selection

Journal Article Biometrika · December 1, 2012 Monte Carlo algorithms are commonly used to identify a set of models for Bayesian model selection or model averaging. Because empirical frequencies of models are often zero or one in high-dimensional problems, posterior probabilities calculated from the ob ... Full text Open Access Cite

Bayesian methods for analysis and adaptive scheduling of exoplanet observations

Journal Article Statistical Methodology · January 1, 2012 We describe work in progress by a collaboration of astronomers and statisticians developing a suite of Bayesian data analysis tools for extrasolar planet (exoplanet) detection, planetary orbit estimation, and adaptive scheduling of observations. Our work a ... Full text Cite

Rao-blackwellization for Bayesian variable selection and model averaging in linear and binary regression: A novel data augmentation approach

Journal Article Journal of the American Statistical Association · October 21, 2011 Featured Publication Choosing the subset of covariates to use in regression or generalized linear models is a ubiquitous problem. The Bayesian paradigm addresses the problem of model uncertainty by considering models corresponding to all possible subsets of the covariates, whe ... Full text Cite

Bayesian adaptive sampling for variable selection and model averaging

Journal Article Journal of Computational and Graphical Statistics · March 1, 2011 For the problem of model choice in linear regression, we introduce a Bayesian adaptive sampling algorithm (BAS), that samples models without replacement from the space of models. For problems that permit enumeration of all models, BAS is guaranteed to enum ... Full text Cite

Generalized beta mixtures of Gaussians

Conference Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011, NIPS 2011 · January 1, 2011 In recent years, a rich variety of shrinkage priors have been proposed that have great promise in addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but have more complex forms and better pro ... Cite

Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels

Journal Article Annals of Statistics · 2011 Featured Publication This article describes a new class of prior distributions for nonparametric function estimation. The unknown function is modeled as a limit of weighted sums of kernels or generator functions indexed by continuous parameters that control local and global fe ... Full text Open Access Cite

Generalized Beta Mixtures of Gaussians

Journal Article Advances in Neural Information Processing Systems · 2011 In recent years, a rich variety of shrinkage priors have been proposed that have great promise in addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but have more complex forms and better pro ... Cite

Bayesian nonparametric models for peak identification in MALDI-TOF mass spectroscopy

Journal Article The Annals of Applied Statistics · 2011 We present a novel nonparametric Bayesian approach based on Lévy Adaptive Regression Kernels (LARK) to model spectral data arising from MALDI-TOF (Matrix Assisted Laser Desorption Ionization Time-of-Flight) mass spectrometry. This model-based approach prov ... Full text Open Access Link to item Cite

BAYESIAN MODEL SEARCH AND MULTILEVEL INFERENCE FOR SNP ASSOCIATION STUDIES.

Journal Article Ann Appl Stat · September 1, 2010 Technological advances in genotyping have given rise to hypothesis-based association studies of increasing scope. As a result, the scientific hypotheses addressed by these studies have become more complex and more difficult to address using existing analyt ... Full text Open Access Link to item Cite

Association between DNA damage response and repair genes and risk of invasive serous ovarian cancer.

Journal Article PLoS One · April 8, 2010 BACKGROUND: We analyzed the association between 53 genes related to DNA repair and p53-mediated damage response and serous ovarian cancer risk using case-control data from the North Carolina Ovarian Cancer Study (NCOCS), a population-based, case-control st ... Full text Open Access Link to item Cite

Do serum biomarkers really measure breast cancer?

Journal Article BMC Cancer · May 28, 2009 BACKGROUND: Because screening mammography for breast cancer is less effective for premenopausal women, we investigated the feasibility of a diagnostic blood test using serum proteins. METHODS: This study used a set of 98 serum proteins and chose diagnostic ... Full text Link to item Cite

Single nucleotide polymorphisms in the TP53 region and susceptibility to invasive epithelial ovarian cancer.

Journal Article Cancer Res · March 15, 2009 The p53 protein is critical for multiple cellular functions including cell growth and DNA repair. We assessed whether polymorphisms in the region encoding TP53 were associated with risk of invasive ovarian cancer. The study population includes a total of 5 ... Full text Link to item Cite

Statistical methods for automated drug susceptibility testing: Bayesian minimum inhibitory concentration prediction from growth curves

Journal Article Annals of Applied Statistics · March 1, 2009 Determination of the minimum inhibitory concentration (MIC) of a drug that prevents microbial growth is an important step for managing patients with infections. In this paper we present a novel probabilistic approach that accurately estimates MICs based on ... Full text Cite

Bayesian function estimation using continuous wavelet dictionaries

Journal Article Statistica Sinica · 2009 We present a Bayesian approach for nonparametric function estimation based on a continuous wavelet dictionary, where the unknown function is modeled by a random sum of wavelet functions at arbitrary locations and scales. By avoiding the dyadic constraints ... Link to item Cite

Polymorphism in the IL18 gene and epithelial ovarian cancer in non-Hispanic white women.

Journal Article Cancer Epidemiol Biomarkers Prev · December 2008 Over 22,000 cases of ovarian cancer were diagnosed in 2007 in the United States, but only a fraction of them can be attributed to mutations in highly penetrant genes such as BRCA1. To determine whether low-penetrance genetic variants contribute to ovarian ... Full text Link to item Cite

Mixtures of g-priors for Bayesian Variable Selection

Journal Article Journal of the American Statistical Association · 2008 Featured Publication Zellner's g prior remains a popular conventional prior for use in Bayesian variable selection, despite several undesirable consistency issues. In this article we study mixtures of g priors as an alternative to default g priors that resolve many of the prob ... Full text Link to item Cite

Nonparametric Function Estimation using Overcomplete Dictionaries (with Discussion)

Chapter · 2007 We consider the nonparametric regression problem of estimating an unknown function based on noisy data. One approach to this estimation problem is to represent the function in a series expansion using a linear combination of basis functions. Overcomplete d ... Cite

Current challenges in Bayesian model choice

Journal Article Statistical Challenges in Modern Astronomy IV · 2007 Link to item Cite

Clinical response to varying the stimulus parameters in deep brain stimulation for essential tremor.

Journal Article Movement disorders : official journal of the Movement Disorder Society · November 2006 Deep brain stimulation (DBS) of the ventral intermediate nucleus of the thalamus for essential tremor is sometimes limited by side effects. The mechanisms by which DBS alleviates tremor or causes side effects are unclear; thus, it is difficult to select st ... Full text Cite

Nonparametric Models for Proteomic Peak Identification and Quantification

Chapter · 2006 We present model-based inference for proteomic peak identification and quantification from mass spectroscopy data, focusing on nonparametric Bayesian models. Using experimental data generated from MALDI-TOF mass spectroscopy (matrix-assisted laser desorpti ... Full text Cite

Bayesian Identification of Differential Gene Expression Induced by Metals in Human Bronchial Epithelial Cells

Journal Article Bayesian Analysis · 2005 The study of genetics continues to advance dramatically with the development of microarray technology. In light of the advancements, interesting statistical challenges have arisen. Given that only one observation can be made from each gene on a single arra ... Full text Cite

Minimum energy single-shock internal atrial defibrillation in sheep.

Journal Article J Interv Card Electrophysiol · April 2004 Well-tolerated internal atrial defibrillation shocks must be below the pain threshold, which has been estimated to be less than 1 Joule. Defibrillation of the atria with low energy is made possible by delivering shocks at the low end of the defibrillation ... Full text Link to item Cite

Model uncertainty

Journal Article Statistical Science · February 1, 2004 The evolution of Bayesian approaches for model uncertainty over the past decade has been remarkable. Catalyzed by advances in methods and technology for posterior computation, the scope of these methods has widened substantially. Major thrusts of these dev ... Full text Cite

Lossless online Bayesian bagging

Journal Article Journal of Machine Learning Research · 2004 © 2004 Herbert K. H. Lee and Merlise A. Clyde. Bagging frequently improves the predictive performance of a model. An online version has recently been introduced, which attempts to gain the benefits of an online algorithm while approximating regular bagging ... Open Access Link to item Cite

Model Averaging

Chapter · 2003 Cite

Health effects of air pollution: A statistical review

Journal Article International Statistical Review · January 1, 2003 We critically review and compare epidemiological designs and statistical approaches to estimate associations between air pollution and health. More specifically, we aim to address the following questions: 1. Which epidemiological designs and statistical me ... Full text Cite

Constrained design strategies for improving normal approximations in nonlinear regression problems

Journal Article Journal of Statistical Planning and Inference · May 1, 2002 In nonlinear regression problems, the assumption is usually made that parameter estimates will be approximately normally distributed. The accuracy of the approximation depends on the sample size and also on the intrinsic and parameter-effects curvatures. B ... Full text Cite

Experimental Design: Bayesian Designs

Chapter · 2001 This article provides an overview of experimental design using a Bayesian decision-theoretic framework. Scientific experimentation requires decisions about how an experiment will be conducted and analyzed. Such decisions depend on the goals and purpose of ... Full text Link to item Cite

Bagging and the Bayesian Bootstrap

Journal Article Artificial Intelligence and Statistics · 2001 Bagging is a method of obtaining more robust predictions when the model class under consideration is unstable with respect to the data, i.e., small changes in the data can cause the predicted values to change significantly. In this paper, we introduce a ... Open Access Link to item Cite

Model uncertainty and health effect studies for particulate matter

Journal Article Environmetrics · December 1, 2000 There are many aspects of model choice that are involved in health effect studies of particulate matter and other pollutants. Some of these choices concern which pollutants and confounding variables should be included in the model, what type of lag structu ... Full text Cite

Flexible empirical Bayes estimation for wavelets

Journal Article Journal of the Royal Statistical Society. Series B: Statistical Methodology · January 1, 2000 Wavelet shrinkage estimation is an increasingly popular method for signal denoising and compression. Although Bayes estimators can provide excellent mean-squared error (MSE) properties, the selection of an effective prior is a difficult task. To address th ... Full text Cite

Conjugate analysis of multivariate normal data with incomplete observations

Journal Article Canadian Journal of Statistics · January 1, 2000 The authors discuss prior distributions that are conjugate to the multivariate normal likelihood when some of the observations are incomplete. They present a general class of priors for incorporating information about unidentified parameters in the covaria ... Full text Cite

Accounting for model uncertainty in prediction of chlorophyll a in Lake Okeechobee

Journal Article Journal of Agricultural, Biological, and Environmental Statistics · January 1, 2000 Long-term eutrophication data along with water quality measurements (total phosphorus and total nitrogen) and other physical environmental factors such as lake level (stage), water temperature, wind speed, and direction were used to develop a model to pre ... Full text Cite

Empirical Bayes estimation in wavelet nonparametric regression

Chapter · 1999 Bayesian methods based on hierarchical mixture models have demonstrated excellent mean squared error properties in constructing data dependent shrinkage estimators in wavelets, however, subjective elicitation of the hyperparameters is challenging. In this ... Full text Link to item Cite

Comment

Journal Article Statistical Science · January 1, 1999 Cite

Sequential importance sampling for nonparametric Bayes models: The next generation

Journal Article The Canadian Journal of Statistics · 1999 Cite

Protein construct storage: Bayesian variable selection and prediction with mixtures.

Journal Article Journal of biopharmaceutical statistics · July 1998 Determining optimal conditions for protein storage while maintaining a high level of protein activity is an important question in pharmaceutical research. A designed experiment based on a space-filling design was conducted to understand the effects of fact ... Full text Cite

Mixture models in the exploration of structure-activity relationships in drug design

Chapter · 1998 We report on a study of mixture modeling problems arising in the assessment of chemical structure-activity relationships in drug design and discovery. Pharmaceutical research laboratories developing test compounds for screening synthesize many related cand ... Full text Cite

Multiple shrinkage and subset selection in wavelets

Journal Article Biometrika · January 1, 1998 This paper discusses Bayesian methods for multiple shrinkage estimation in wavelets. Wavelets are used in applications for data denoising, via shrinkage of the coefficients towards zero, and for data compression, by shrinkage and setting small coefficients ... Full text Cite

Strategies for Model Mixing in Generalized Linear Models

Journal Article Artificial Intelligence and Statistics · 1997 Cite

The Equivalence of Constrained and Weighted Designs in Multiple Objective Design Problems

Journal Article Journal of the American Statistical Association · September 1996 Full text Cite

Prediction Via Orthogonalized Model Mixing

Journal Article Journal of the American Statistical Association · September 1996 Full text Cite

The equivalence of constrained and weighted designs in multiple objective design problems

Journal Article Journal of the American Statistical Association · September 1, 1996 Several competing objectives may be relevant in the design of an experiment. The competing objectives may not be easy to characterize in a single optimality criterion. One approach to these design problems has been to weight each criterion and find the des ... Full text Cite

Inference and design strategies for a hierarchical logistic regression model

Chapter · 1996 This chapter focuses on Bayesian inference and design in binary regression experiments. As a case study we consider heart defibrillator experiments in which the number of observations that can be taken is limited and it is important to incorporate all a ... Link to item Cite

Orthogonalizations and Prior Distributions for Orthogonalized Model Mixing

Chapter · 1996 Prediction methods based on mixing over a set of plausible models can help alleviate the sensitivity of inference and decisions to modeling assumptions. One important application area is prediction in linear models. Computing techniques for model mixing in ... Full text Cite

Prediction via orthogonalized model mixing

Journal Article Journal of the American Statistical Association · 1996 We introduce an approach and algorithms for model mixing in large prediction problems with correlated predictors. We focus on the choice of predictors in linear models, and mix over possible subsets of candidate predictors. Our approach is based on express ... Full text Cite

Bayesian Designs for Approximate Normality

Chapter · 1995 In many experimental design problems, the primary interest is in estimating functions of the parameters and a design is selected according to some optimality criterion. The assumption that parameter estimates are approximately normally distributed is often ... Full text Link to item Cite

Optimal Design for Heart Defibrillators

Chapter · 1995 During heart defibrillator implantation, a physician fibrillates the patient’s heart several times at different test strengths to estimate the effective strength necessary for defibrillation. One strategy is to implant at the strength that defibrillates 9 ... Full text Link to item Cite

A Comparison of Algorithms for Sampling Models

Conference Proceedings of the 1994 Joint Statistical Meetings; Section on Bayesian Statistical Science · 1994 Link to item Cite

Logistic regression for spatial pair-potential models

Chapter · 1991 The spatial models considered in this paper are Gibbs processes with pairwise interaction potentials, which provide a rich framework for models where the likelihood of a particular configuration of points depends on attraction or repulsion between neighbor ... Full text Cite

Geographic patterns of variation in allozymes and height growth in white spruce

Journal Article Canadian Journal of Forest Research · January 1, 1991 Variation in height at ages nine and 19 and at six polymorphic allozyme loci was examined for 22 seed sources in a range-wide Picea glauca provenance test. -from Authors ... Full text Cite

EFFECTS OF AVIAN SEED DISPERSAL ON THE GENETIC STRUCTURE OF WHITEBARK PINE POPULATIONS.

Journal Article Evolution; international journal of organic evolution · May 1987 We used allozyme analysis to examine family structure, the spatial patterning of related individuals, in two populations of whitebark pine (Pinus albicaulis), a subalpine conifer that commonly displays a multistem form. The individual stems within clumps a ... Full text Cite

Effects of avian seed dispersal on the genetic structure of whitebark pine populations.

Journal Article Evolution · January 1, 1987 Pinus albicaulis is a subalpine conifer that commonly displays a multistem form. The individual stems within clumps are genetically distinct individuals, having arisen from separate seeds. Individuals within a clump are genetically more similar than indivi ... Full text Cite