Fan Li

Professor of Statistical Science
Statistical Science
Box 90251, Durham, NC 27708-0251
122 Old Chem Bldg, Durham, NC 27708

Overview


My main research interest is causal inference and its applications to health, policy, and social science. I also work at the interface between causal inference and machine learning. I have developed methods for propensity scores, clinical trials, randomized experiments (e.g., A/B testing), difference-in-differences, regression discontinuity designs, and representation learning. I also work on Bayesian analysis and statistical methods for missing data. I serve as the editor for social science, biostatistics, and policy for the journal Annals of Applied Statistics.

Current Appointments & Affiliations


Professor of Statistical Science · 2021 - Present Statistical Science, Trinity College of Arts & Sciences
Professor of Biostatistics & Bioinformatics · 2021 - Present Biostatistics & Bioinformatics, Division of Biostatistics, Biostatistics & Bioinformatics

In the News


Published September 28, 2021
Fan Li: Using Math to Help Physicians Make Better COVID Treatment Decisions

Recent Publications


Application of unified health large language model evaluation framework to In-Basket message replies: bridging qualitative and quantitative assessments.

Journal Article J Am Med Inform Assoc · April 1, 2025 OBJECTIVES: Large language models (LLMs) are increasingly utilized in healthcare, transforming medical practice through advanced language processing capabilities. However, the evaluation of LLMs predominantly relies on human qualitative assessment, which i ...

Development of a natural language processing algorithm to extract social determinants of health from clinician notes.

Journal Article Am J Transplant · March 6, 2025 Disparities in access to the organ transplant waitlist are well-documented, but research into modifiable factors has been limited due to a lack of access to organized prewaitlisting data. This study aimed to develop a natural language processing (NLP) algo ...

Random Survival Forest Machine Learning for the Prediction of Cardiovascular Events Among Patients With a Measured Lipoprotein(a) Level: A Model Development Study.

Journal Article Circ Genom Precis Med · February 2025 BACKGROUND: Established risk models may not be applicable to patients at higher cardiovascular risk with a measured Lp(a) (lipoprotein[a]) level, a causal risk factor for atherosclerotic cardiovascular disease. METHODS: This was a model development study. ...

Recent Grants


Deprescribing Decision-Making using Machine Learning Individualized Treatment Rules to Improve CNS Polypharmacy

Research · Co Investigator · Awarded by National Institutes of Health · 2024 - 2029

Principal stratification methods and software for intercurrent events in clinical trials

Research · Principal Investigator · Awarded by University of North Carolina - Chapel Hill · 2023 - 2028

Innovative Biostatistical Methods for Analysis and Assessment of Clinical Trials Augmented by Real World Data

Research · Co Investigator · Awarded by Burroughs Wellcome Fund · 2021 - 2026

Education, Training & Certifications


Johns Hopkins University · 2006 Ph.D.
Peking University (China) · 2001 B.S.

External Links


Personal site