
Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations.

Publication, Journal Article
Yang, R; Tong, J; Wang, H; Huang, H; Hu, Z; Li, P; Liu, N; Lindsell, CJ; Pencina, MJ; Chen, Y; Hong, C
Published in: J Am Med Inform Assoc
August 31, 2025

OBJECTIVES: Systematic reviews in comparative effectiveness research require timely evidence synthesis. With the rapid advancement of medical research, preprint articles play an increasingly important role in accelerating knowledge dissemination. However, as preprint articles are not peer-reviewed before publication, their quality varies significantly, posing challenges for evidence inclusion in systematic reviews.

MATERIALS AND METHODS: We developed AutoConfidenceScore (automated confidence score assessment), an advanced framework for predicting preprint publication, which reduces reliance on manual curation and expands the range of predictors, including three key advancements: (1) automated data extraction using natural language processing techniques, (2) semantic embeddings of titles and abstracts, and (3) large language model (LLM)-driven evaluation scores. Additionally, we employed two prediction models: a random forest classifier for binary outcome and a survival cure model that predicts both binary outcome and publication risk over time.

RESULTS: The random forest classifier achieved an area under the receiver operating characteristic curve (AUROC) of 0.747 using all features. The survival cure model achieved an AUROC of 0.731 for binary outcome prediction and a concordance index of 0.667 for time-to-publication risk.

DISCUSSION: Our study advances the framework for preprint publication prediction through automated data extraction and multiple feature integration. By combining semantic embeddings with LLM-driven evaluations, AutoConfidenceScore significantly enhances predictive performance while reducing manual annotation burden.

CONCLUSION: AutoConfidenceScore has the potential to facilitate incorporation of preprint articles during the appraisal phase of systematic reviews, supporting researchers in more effective utilization of preprint resources.
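For readers unfamiliar with the two metrics reported in the abstract, the sketch below computes AUROC and a concordance index (Harrell's C) from their textbook pairwise definitions. It is not the authors' code and does not reproduce their models or results. The function names and the toy data (publication flags, predicted scores, follow-up months) are hypothetical, and the concordance index here uses a simplified rule that skips pairs with tied times.

```python
from itertools import combinations


def auroc(labels, scores):
    """Chance that a random positive gets a higher score than a random negative.

    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    A pair is comparable when the earlier time is an observed event.
    Simplification (assumption): pairs with tied times are skipped.
    """
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue
        early, late = (i, j) if times[i] < times[j] else (j, i)
        if not events[early]:
            continue  # earlier subject was censored, so the order is unknown
        comparable += 1
        if risks[early] > risks[late]:
            concordant += 1.0
        elif risks[early] == risks[late]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: 1 = preprint was later published, 0 = not (censored).
published = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.5, 0.2]  # predicted publication score
months = [3, 24, 6, 12, 24, 24]          # time to publication or end of follow-up

print(round(auroc(published, scores), 3))
print(round(concordance_index(months, published, scores), 3))
```

On this toy data, 7 of the 9 positive-negative pairs are ranked correctly for AUROC, and 10 of the 12 comparable pairs are ranked correctly for the concordance index. In both metrics a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.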

Published In: J Am Med Inform Assoc
DOI: 10.1093/jamia/ocaf137
EISSN: 1527-974X
Publication Date: August 31, 2025
Location: England

Related Subject Headings

  • Medical Informatics
  • 46 Information and computing sciences
  • 42 Health sciences
  • 32 Biomedical and clinical sciences
  • 11 Medical and Health Sciences
  • 09 Engineering
  • 08 Information and Computing Sciences
 

Citation

APA: Yang, R., Tong, J., Wang, H., Huang, H., Hu, Z., Li, P., … Hong, C. (2025). Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations. J Am Med Inform Assoc. https://doi.org/10.1093/jamia/ocaf137

Chicago: Yang, Rui, Jiayi Tong, Haoyuan Wang, Hui Huang, Ziyang Hu, Peiyu Li, Nan Liu, et al. “Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations.” J Am Med Inform Assoc, August 31, 2025. https://doi.org/10.1093/jamia/ocaf137.

ICMJE: Yang R, Tong J, Wang H, Huang H, Hu Z, Li P, et al. Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations. J Am Med Inform Assoc. 2025 Aug 31;

MLA: Yang, Rui, et al. “Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations.” J Am Med Inform Assoc, Aug. 2025. PubMed, doi:10.1093/jamia/ocaf137.

NLM: Yang R, Tong J, Wang H, Huang H, Hu Z, Li P, Liu N, Lindsell CJ, Pencina MJ, Chen Y, Hong C. Enabling inclusive systematic reviews: incorporating preprint articles with large language model-driven evaluations. J Am Med Inform Assoc. 2025 Aug 31;