Scholars@Duke publication: Large Language Models and the Analyses of Adherence to Reporting Guidelines in Systematic Reviews and Overviews of Reviews (PRISMA 2020 and PRIOR).

Publication, Journal Article

Forero, DA; Abreu, SE; Tovar, BE; Oermann, MH

Published in: Journal of Medical Systems

June 2025

In the context of Evidence-Based Practice (EBP), Systematic Reviews (SRs), Meta-Analyses (MAs) and overviews of reviews have become cornerstones for the synthesis of research findings. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 and Preferred Reporting Items for Overviews of Reviews (PRIOR) statements have become major reporting guidelines for SRs/MAs and for overviews of reviews, respectively. In recent years, advances in Generative Artificial Intelligence (genAI) have been proposed as a potential major paradigm shift in scientific research. The main aim of this research was to examine the performance of four Large Language Models (LLMs) in analyzing adherence to PRISMA 2020 and PRIOR in a sample of 20 SRs and 20 overviews of reviews. We tested the free versions of four commonly used LLMs: ChatGPT (GPT-4o), DeepSeek (V3), Gemini (2.0 Flash) and Qwen (2.5 Max). The adherence scores produced by the LLMs were compared, using several statistical tests, with scores previously defined by human experts. All four LLMs performed poorly in analyzing adherence to PRISMA 2020, overestimating the percentage of adherence (from 23 to 30%). For PRIOR, the LLMs showed smaller differences in the estimation of adherence (from 6 to 14%), and ChatGPT performed similarly to the human experts. This is the first report of the performance of four commonly used LLMs in analyzing adherence to PRISMA 2020 and PRIOR. Future studies of adherence to other reporting guidelines will be helpful in health sciences research.

Duke Scholars

Author: Marilyn Haag Oermann, School of Nursing

Published In

Journal of Medical Systems

DOI

10.1007/s10916-025-02212-0

EISSN

1573-689X

ISSN

0148-5598

Publication Date

June 2025

Volume

49

Issue

1

Start / End Page

80

Related Subject Headings

Systematic Reviews as Topic
Meta-Analysis as Topic
Medical Informatics
Large Language Models
Humans
Guidelines as Topic
Guideline Adherence
Artificial Intelligence
4203 Health services and systems
1117 Public Health and Health Services

Citation

APA

Forero, D. A., Abreu, S. E., Tovar, B. E., & Oermann, M. H. (2025). Large Language Models and the Analyses of Adherence to Reporting Guidelines in Systematic Reviews and Overviews of Reviews (PRISMA 2020 and PRIOR). Journal of Medical Systems, 49(1), 80. https://doi.org/10.1007/s10916-025-02212-0

Chicago

Forero, Diego A., Sandra E. Abreu, Blanca E. Tovar, and Marilyn H. Oermann. “Large Language Models and the Analyses of Adherence to Reporting Guidelines in Systematic Reviews and Overviews of Reviews (PRISMA 2020 and PRIOR).” Journal of Medical Systems 49, no. 1 (June 2025): 80. https://doi.org/10.1007/s10916-025-02212-0.

ICMJE

Forero DA, Abreu SE, Tovar BE, Oermann MH. Large Language Models and the Analyses of Adherence to Reporting Guidelines in Systematic Reviews and Overviews of Reviews (PRISMA 2020 and PRIOR). Journal of medical systems. 2025 Jun;49(1):80.

MLA

Forero, Diego A., et al. “Large Language Models and the Analyses of Adherence to Reporting Guidelines in Systematic Reviews and Overviews of Reviews (PRISMA 2020 and PRIOR).” Journal of Medical Systems, vol. 49, no. 1, June 2025, p. 80. Epmc, doi:10.1007/s10916-025-02212-0.
