Overview
My research focuses on developing new tools for probabilistic learning from complex data. Methods development is directly motivated by challenging applications in ecology/biodiversity, neuroscience, environmental health, criminal justice/fairness, and more. We seek to develop new modeling frameworks, algorithms, and corresponding code that can be used routinely by scientists and decision makers. We are also interested in new inference frameworks and in studying the theoretical properties of the methods we develop.
Some highlight application areas:
(1) Modeling of biological communities and biodiversity - we are considering global data on fungi, insects, birds, and other animals, including DNA sequences, images, audio, etc. The data contain large numbers of species unknown to science, and we would like to learn about these new species, community network structure, and the impact of environmental change and climate.
(2) Brain connectomics - based on high-resolution imaging data of the human brain, we are seeking to develop new statistical and machine learning models for relating brain networks to human traits and diseases.
(3) Environmental health & mixtures - we are building tools for relating chemical and other exposures (air pollution, etc.) to human health outcomes, accounting for spatial dependence in both exposures and disease. This includes an emphasis on infectious disease modeling, such as COVID-19.
Statistical areas that play a prominent role in our methods development include models for low-dimensional structure in data (latent factors, clustering, geometric and manifold learning), flexible/nonparametric models (neural networks, Gaussian/spatial processes, other stochastic processes), Bayesian inference frameworks, efficient sampling and analytic approximation algorithms, and models for "object data" (trees, networks, images, spatial processes, etc.).
Current Appointments & Affiliations
Arts and Sciences Distinguished Professor of Statistical Science · 2013 - Present
Statistical Science, Trinity College of Arts & Sciences
Professor of Statistical Science · 2008 - Present
Statistical Science, Trinity College of Arts & Sciences
Professor in the Department of Mathematics · 2014 - Present
Mathematics, Trinity College of Arts & Sciences
Faculty Network Member of the Duke Institute for Brain Sciences · 2011 - Present
Duke Institute for Brain Sciences, University Institutes and Centers
Recent Publications
Accelerated algorithms for convex and non-convex optimization on manifolds
Journal Article · Machine Learning · March 1, 2025
We propose a general scheme for solving convex and non-convex optimization problems on manifolds. The central idea is that, by adding a multiple of the squared retraction distance to the objective function in question, we “convexify” the objective function ...
LOW-RANK LONGITUDINAL FACTOR REGRESSION WITH APPLICATION TO CHEMICAL MIXTURES.
Journal Article · The Annals of Applied Statistics · March 2025
Developmental epidemiology commonly focuses on assessing the association between multiple early life exposures and childhood health. Statistical analyses of data from such studies focus on inferring the contributions of individual exposures, while also cha ...
INFERRING SYNERGISTIC AND ANTAGONISTIC INTERACTIONS IN MIXTURES OF EXPOSURES
Journal Article · Annals of Applied Statistics · March 1, 2025
There is abundant interest in assessing the joint effects of multiple exposures on human health. This is often referred to as the mixtures problem in environmental epidemiology and toxicology. Classically, studies have examined the adverse health effects o ...
Recent Grants
Duke University Program in Environmental Health
Inst. Training Prgm or CME · Mentor · Awarded by National Institutes of Health · 2019 - 2029
Improving inferences on health effects of chemical exposures
Research · Principal Investigator · Awarded by National Institute of Environmental Health Sciences · 2023 - 2028
R01: Genetic Origins of Adverse Outcomes in African Americans with Lymphoma
Research · Co Investigator · Awarded by National Institutes of Health · 2023 - 2028
Education, Training & Certifications
Emory University · 1997 · Ph.D.
Pennsylvania State University · 1994 · B.S.