The role of generative artificial intelligence in deciding fusion treatment of lumbar degeneration: a comparative analysis and narrative review.
PURPOSE: This study analyzed the responses, and the readability of those responses, given by generative artificial intelligence (AI) models to questions derived from the recommendations of the 2014 Journal of Neurosurgery: Spine (JNS) guidelines for fusion procedures in the treatment of degenerative lumbar spine disease.

METHODS: Twenty-four questions were generated from the JNS guidelines and posed to ChatGPT 4o, Perplexity, Microsoft Copilot, and Gemini. A response was classified as "concordant" if it highlighted all points from the JNS guidelines; otherwise, it was classified as "non-concordant" and further sub-categorized as either "insufficient" or "over-conclusive." Responses were evaluated for readability using the Flesch-Kincaid Grade Level, Gunning Fog Index, Simple Measure of Gobbledygook (SMOG) Index, and Flesch Reading Ease test.

RESULTS: ChatGPT 4o had the highest concordance rate at 66.67%, with non-concordant responses split evenly between insufficient (16.67%) and over-conclusive (16.67%). Perplexity displayed a 58.33% concordance rate, with 25% insufficient and 16.67% over-conclusive responses. Copilot showed 50% concordance, with 37.5% over-conclusive and 16.67% insufficient responses. Gemini demonstrated 54.17% concordance, with 20.83% insufficient and 25% over-conclusive responses. The Flesch-Kincaid Grade Level scores ranged from 14.03 (Copilot) to 15.66 (Perplexity). The Gunning Fog Index scores ranged from 15.15 (Copilot) to 18.13 (Perplexity). The SMOG Index scores ranged from 14.69 (Copilot) to 16.49 (Perplexity). The Flesch Reading Ease scores were low across all models, with Copilot showing the highest score of 20.71.

CONCLUSIONS: ChatGPT 4o was the best-performing model in terms of concordance, while Perplexity produced the most complex text as measured by readability. AI can be a valuable adjunct in clinical decision-making but cannot replace clinician judgment.
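For readers unfamiliar with the four readability measures named in the abstract, the following is a minimal sketch of their standard published formulas. The abstract does not state which software or counting rules the authors used, so the function names and the choice to pass in pre-computed counts (words, sentences, syllables, and words of three or more syllables) are assumptions made for illustration only.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog Index; complex_words are words of three or more syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def smog_index(polysyllables: int, sentences: int) -> float:
    """SMOG Index; polysyllables are words of three or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


if __name__ == "__main__":
    # Hypothetical counts for a short passage; not data from the study.
    words, sentences, syllables, complex_words = 100, 5, 150, 10
    print(round(flesch_reading_ease(words, sentences, syllables), 2))
    print(round(flesch_kincaid_grade(words, sentences, syllables), 2))
    print(round(gunning_fog(words, sentences, complex_words), 2))
    print(round(smog_index(complex_words, sentences), 2))
```

On these scales, the grade-level scores of roughly 14 to 18 reported above correspond to college-level text, and a Flesch Reading Ease score near 20 falls in the range conventionally described as very difficult to read.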
Related Subject Headings
- Spinal Fusion
- Practice Guidelines as Topic
- Orthopedics
- Lumbar Vertebrae
- Intervertebral Disc Degeneration
- Humans
- Generative Artificial Intelligence
- Clinical Decision-Making
- Artificial Intelligence
- 4201 Allied health and rehabilitation science