Scholars@Duke publication: Multi-disease Classification of CT Reports using Traditional Natural Language Processing and a Lightweight Foundation Model

Publication, Conference

Garcia-Alcoser, ME; Tushar, FI; Nejad, MG; Rubin, GD; Lo, JY

Published in: Progress in Biomedical Optics and Imaging Proceedings of SPIE

January 1, 2025

Natural language processing (NLP) methods can annotate free-text radiology reports to create large datasets at the scale of an entire health system or beyond. Generalizing the disease classification across multiple organ systems inherently requires a complex, robust, and accurate classification model. Concurrently, NLP methods have significantly improved and become more sophisticated. This study compares two traditional NLP methods, a rule-based algorithm (RBA) and a Bidirectional Long Short-Term Memory network (BiLSTM), with a lightweight variant of the Large Language Model Meta AI (Llama) model. Our goal is to analyze the capabilities and limitations of each model in accurately classifying diseases encountered in chest, abdominal, and pelvic computed tomography (CT) exams. Rule-based algorithms (RBAs) were used to extract disease labels from the “findings” section of CT radiology reports, creating the training, validation, and testing datasets. Disease labels were made for three organ systems: the lungs/pleura, liver/gallbladder, and kidneys/ureters. A BiLSTM network with an attention mechanism was trained on 151,431 cases and tested on 85,987 cases. The BiLSTM and Meta's Llama3.1-8B model were evaluated on the RBA-labeled test set and a manually annotated dataset. On the smaller, manually labeled test set, the RBA model achieved the highest macro F1 score (0.94), followed by the BiLSTM (0.91) and then Llama (0.89). In contrast, on the larger RBA-labeled test set, the BiLSTM maintained high performance (average AUC > 0.98; macro F1 = 0.95), while Llama's macro F1 dropped to 0.65. Manual spot checking of reports where Llama disagreed with RBA/BiLSTM revealed numerous instances in which Llama was actually correct, indicating flaws in the previous RBA labeling. This study emphasizes the limitations of rule-based approaches and the need to consider clinical context in ambiguous scenarios. Llama3.1-8B exhibits the potential to outperform rule-based methods, indicating promise for reliable, large-scale multi-disease classification in CT text reports.
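
The abstract reports macro F1 scores without describing how they were computed. As a rough illustration only, the sketch below shows one common definition: compute F1 separately for each label, then take the unweighted mean. The label names, the toy data, and the choice to average per organ-system label are assumptions made for this example; they are not taken from the paper, and the authors' exact averaging scheme may differ.

```python
# Minimal sketch of macro-averaged F1 for multi-label classification.
# Label names and data are hypothetical, for illustration only.

def f1_for_label(y_true, y_pred):
    """Binary F1 for one label, given parallel lists of 0/1 values."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when a label has no positives at all.
    return 2 * tp / denom if denom else 0.0

def macro_f1(true_by_label, pred_by_label):
    """Unweighted mean of per-label F1 scores."""
    scores = [
        f1_for_label(true_by_label[name], pred_by_label[name])
        for name in true_by_label
    ]
    return sum(scores) / len(scores)

# Toy reference labels and predictions for four reports (hypothetical).
TRUE = {
    "lungs_pleura":      [1, 0, 1, 1],
    "liver_gallbladder": [0, 1, 0, 0],
    "kidneys_ureters":   [1, 1, 0, 0],
}
PRED = {
    "lungs_pleura":      [1, 0, 0, 1],
    "liver_gallbladder": [0, 1, 1, 0],
    "kidneys_ureters":   [1, 1, 0, 0],
}

if __name__ == "__main__":
    print(f"macro F1 = {macro_f1(TRUE, PRED):.3f}")
```

Because every label contributes equally to the mean, a model that does poorly on one label is penalized regardless of how frequent that label is, which is one reason macro F1 is often preferred for imbalanced multi-disease label sets.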

Duke Scholars

Author Joseph Yuan-Chieh Lo Radiology

Author Geoffrey D Rubin Radiology

Published In

Progress in Biomedical Optics and Imaging Proceedings of SPIE

DOI

10.1117/12.3047690

ISSN

1605-7422

Publication Date

January 1, 2025

Volume

13411

Citation

APA

Garcia-Alcoser, M. E., Tushar, F. I., Nejad, M. G., Rubin, G. D., & Lo, J. Y. (2025). Multi-disease Classification of CT Reports using Traditional Natural Language Processing and a Lightweight Foundation Model. In Progress in Biomedical Optics and Imaging Proceedings of SPIE (Vol. 13411). https://doi.org/10.1117/12.3047690

Chicago

Garcia-Alcoser, M. E., F. I. Tushar, M. G. Nejad, G. D. Rubin, and J. Y. Lo. “Multi-disease Classification of CT Reports using Traditional Natural Language Processing and a Lightweight Foundation Model.” In Progress in Biomedical Optics and Imaging Proceedings of SPIE, Vol. 13411, 2025. https://doi.org/10.1117/12.3047690.

ICMJE

Garcia-Alcoser ME, Tushar FI, Nejad MG, Rubin GD, Lo JY. Multi-disease Classification of CT Reports using Traditional Natural Language Processing and a Lightweight Foundation Model. In: Progress in Biomedical Optics and Imaging Proceedings of SPIE. 2025.

MLA

Garcia-Alcoser, M. E., et al. “Multi-disease Classification of CT Reports using Traditional Natural Language Processing and a Lightweight Foundation Model.” Progress in Biomedical Optics and Imaging Proceedings of SPIE, vol. 13411, 2025. Scopus, doi:10.1117/12.3047690.

NLM

Garcia-Alcoser ME, Tushar FI, Nejad MG, Rubin GD, Lo JY. Multi-disease Classification of CT Reports using Traditional Natural Language Processing and a Lightweight Foundation Model. Progress in Biomedical Optics and Imaging Proceedings of SPIE. 2025.
