Scholars@Duke publication: The evaluation of the performance of ChatGPT in the management of labor analgesia.

Publication, Journal Article

Ismaiel, N; Nguyen, TP; Guo, N; Carvalho, B; Sultan, P; study collaborators

Published in: J Clin Anesth

November 2024

ChatGPT4 is a leading large language model (LLM) chatbot released by OpenAI in 2023. ChatGPT4 can respond to free-text queries, answer questions and make suggestions regarding virtually any topic. ChatGPT4 has successfully answered anesthesia and even obstetric anesthesia knowledge-based questions with reasonable accuracy. However, ChatGPT4 has yet to be challenged in obstetric anesthesia clinical decision-making.

STUDY OBJECTIVE: In this study, we evaluated the performance of ChatGPT4 in the management of clinical labor analgesia scenarios compared to expert obstetric anesthesiologists.

INTERVENTION: Eight clinical questions with progressively increasing medical complexity were posed to ChatGPT4.

MEASUREMENTS: The ChatGPT4 responses were rated by seven expert obstetric anesthesiologists based on safety, accuracy and completeness of each response using a five-point Likert rating scale.

MAIN RESULTS: ChatGPT4 was deemed safe in 73% of responses to the presented obstetric anesthesia clinical scenarios (27% of responses were deemed unsafe). None of the ChatGPT4 responses were unanimously deemed to be safe by all seven expert obstetric anesthesiologists. Moreover, ChatGPT4 responses were overall partly accurate (score 4 out of 5) and somewhat incomplete (score 3.5 out of 5).

CONCLUSIONS: In summary, approximately one quarter of all responses by ChatGPT4 were deemed unsafe by expert obstetric anesthesiologists. These findings may suggest the need for more fine-tuning and training of LLMs such as ChatGPT4 specifically for clinical decision making in obstetric anesthesia or other specialized medical fields. These LLMs may come to play an important future role in assisting obstetric anesthesiologists in clinical decision making and enhancing overall patient care.

Duke Scholars

Author: Ashraf Samir Habib (Anesthesiology, Women's)

Published In

J Clin Anesth

DOI

10.1016/j.jclinane.2024.111582

EISSN

1873-4529

Publication Date

November 2024

Volume

98

Start / End Page

111582

Location

United States

Related Subject Headings

Pain Management
Machine Learning
Labor Pain
Humans
Female
Anesthesiology
Analgesia, Obstetrical
3202 Clinical sciences
1103 Clinical Sciences

Citation

APA

Ismaiel, N., Nguyen, T. P., Guo, N., Carvalho, B., Sultan, P., & study collaborators. (2024). The evaluation of the performance of ChatGPT in the management of labor analgesia. J Clin Anesth, 98, 111582. https://doi.org/10.1016/j.jclinane.2024.111582

Chicago

Ismaiel, Nada, Teresa Phuongtram Nguyen, Nan Guo, Brendan Carvalho, Pervez Sultan, and study collaborators. “The evaluation of the performance of ChatGPT in the management of labor analgesia.” J Clin Anesth 98 (November 2024): 111582. https://doi.org/10.1016/j.jclinane.2024.111582.

ICMJE

Ismaiel N, Nguyen TP, Guo N, Carvalho B, Sultan P, study collaborators. The evaluation of the performance of ChatGPT in the management of labor analgesia. J Clin Anesth. 2024 Nov;98:111582.

MLA

Ismaiel, Nada, et al. “The evaluation of the performance of ChatGPT in the management of labor analgesia.” J Clin Anesth, vol. 98, Nov. 2024, p. 111582. Pubmed, doi:10.1016/j.jclinane.2024.111582.

NLM

Ismaiel N, Nguyen TP, Guo N, Carvalho B, Sultan P, study collaborators. The evaluation of the performance of ChatGPT in the management of labor analgesia. J Clin Anesth. 2024 Nov;98:111582.
