
Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery.

Publication, Journal Article
Nieves-Lopez, B; Bechtle, AR; Traverse, J; Klifto, C; Schoch, BS; Aziz, KT
Published in: Orthopedics
2025

BACKGROUND: The purpose of this study was to evaluate the performance and evolution of Chat Generative Pre-Trained Transformer (ChatGPT; OpenAI) as a resource for shoulder and elbow surgery information by assessing its accuracy on the American Academy of Orthopaedic Surgeons shoulder-elbow self-assessment questions. We hypothesized that both ChatGPT models would demonstrate proficiency and that there would be significant improvement with progressive iterations.

MATERIALS AND METHODS: A total of 200 questions were selected from the 2019 and 2021 American Academy of Orthopaedic Surgeons shoulder-elbow self-assessment questions. ChatGPT 3.5 and 4 were used to evaluate all questions. Questions with non-text data were excluded (114 questions). Remaining questions were input into ChatGPT and categorized as follows: anatomy, arthroplasty, basic science, instability, miscellaneous, nonoperative, and trauma. ChatGPT's performance was quantified and compared across categories with chi-square tests. The continuing medical education credit threshold of 50% was used to determine proficiency. Statistical significance was set at P<.05.

RESULTS: ChatGPT 3.5 and 4 answered 52.3% and 73.3% of the questions correctly, respectively (P=.003). ChatGPT 3.5 performed significantly better in the instability category than in the other categories (P=.037). ChatGPT 4's performance did not significantly differ across categories (P=.841). ChatGPT 4 performed significantly better than ChatGPT 3.5 in all categories except instability and miscellaneous.

CONCLUSION: ChatGPT 3.5 and 4 exceeded the proficiency threshold. ChatGPT 4 performed better than ChatGPT 3.5, showing an increased capability to correctly answer shoulder and elbow-focused questions. Further refinement of ChatGPT's training may improve its performance and utility as a resource. Currently, ChatGPT remains unable to answer questions with high enough accuracy to replace clinical decision-making. [Orthopedics. 2025;48(2):e69-e74.].

Published In

Orthopedics

DOI

10.3928/01477447-20250123-03

EISSN

1938-2367

Publication Date

2025

Volume

48

Issue

2

Start / End Page

e69 / e74

Location

United States

Related Subject Headings

  • Shoulder Joint
  • Shoulder
  • Orthopedics
  • Orthopedic Procedures
  • Humans
  • Generative Artificial Intelligence
  • Elbow Joint
  • Elbow
  • 3202 Clinical sciences
  • 1103 Clinical Sciences
 

Citation

APA: Nieves-Lopez, B., Bechtle, A. R., Traverse, J., Klifto, C., Schoch, B. S., & Aziz, K. T. (2025). Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery. Orthopedics, 48(2), e69–e74. https://doi.org/10.3928/01477447-20250123-03
Chicago: Nieves-Lopez, Benjamin, Alexandra R. Bechtle, Jennifer Traverse, Christopher Klifto, Bradley S. Schoch, and Keith T. Aziz. “Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery.” Orthopedics 48, no. 2 (2025): e69–74. https://doi.org/10.3928/01477447-20250123-03.
ICMJE: Nieves-Lopez B, Bechtle AR, Traverse J, Klifto C, Schoch BS, Aziz KT. Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery. Orthopedics. 2025;48(2):e69–74.
MLA: Nieves-Lopez, Benjamin, et al. “Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery.” Orthopedics, vol. 48, no. 2, 2025, pp. e69–74. PubMed, doi:10.3928/01477447-20250123-03.
NLM: Nieves-Lopez B, Bechtle AR, Traverse J, Klifto C, Schoch BS, Aziz KT. Evaluating the Evolution of ChatGPT as an Information Resource in Shoulder and Elbow Surgery. Orthopedics. 2025;48(2):e69–e74.
