Overview
My main research interest is causal inference and its applications to health, policy, and social science. I also work at the interface between causal inference and machine learning. I have developed methods for propensity scores, clinical trials, randomized experiments (e.g., A/B testing), difference-in-differences, regression discontinuity designs, and representation learning. I also work on Bayesian analysis and statistical methods for missing data. I serve as the editor for social science, biostatistics, and policy for the journal Annals of Applied Statistics.
Current Appointments & Affiliations
Professor of Statistical Science · 2021 - Present
Statistical Science, Trinity College of Arts & Sciences
Professor of Biostatistics & Bioinformatics · 2021 - Present
Biostatistics & Bioinformatics, Division of Biostatistics, Biostatistics & Bioinformatics
Recent Publications
Application of unified health large language model evaluation framework to In-Basket message replies: bridging qualitative and quantitative assessments.
Journal Article · J Am Med Inform Assoc · April 1, 2025
OBJECTIVES: Large language models (LLMs) are increasingly utilized in healthcare, transforming medical practice through advanced language processing capabilities. However, the evaluation of LLMs predominantly relies on human qualitative assessment, which i ...
Development of a natural language processing algorithm to extract social determinants of health from clinician notes.
Journal Article · Am J Transplant · March 6, 2025
Disparities in access to the organ transplant waitlist are well-documented, but research into modifiable factors has been limited due to a lack of access to organized prewaitlisting data. This study aimed to develop a natural language processing (NLP) algo ...
Random Survival Forest Machine Learning for the Prediction of Cardiovascular Events Among Patients With a Measured Lipoprotein(a) Level: A Model Development Study.
Journal Article · Circ Genom Precis Med · February 2025
BACKGROUND: Established risk models may not be applicable to patients at higher cardiovascular risk with a measured Lp(a) (lipoprotein[a]) level, a causal risk factor for atherosclerotic cardiovascular disease. METHODS: This was a model development study. ...
Recent Grants
Deprescribing Decision-Making using Machine Learning Individualized Treatment Rules to Improve CNS Polypharmacy
Research · Co Investigator · Awarded by National Institutes of Health · 2024 - 2029
Principal stratification methods and software for intercurrent events in clinical trials
Research · Principal Investigator · Awarded by University of North Carolina - Chapel Hill · 2023 - 2028
Innovative Biostatistical Methods for Analysis and Assessment of Clinical Trials Augmented by Real World Data
Research · Co Investigator · Awarded by Burroughs Wellcome Fund · 2021 - 2026
Education, Training & Certifications
Johns Hopkins University · 2006 · Ph.D.
Peking University (China) · 2001 · B.S.