Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach.

Publication ,  Journal Article
Short, RG; Bralich, J; Bogaty, D; Befera, NT
Published in: J Digit Imaging
October 2019

Radiology reports contain a large amount of potentially valuable unstructured data. Recently, neural networks have been employed to perform classification of radiology reports over a few classes at the document level. The success of neural networks in sequence-labeling problems such as named entity recognition and part-of-speech tagging suggests that they could be used to classify radiology report text with greater granularity. We employed a neural network architecture to comprehensively classify mammography report text at the word level using a sequence-labeling approach. Two radiologists devised a comprehensive classification system for screening mammography reports. Each word in each report was manually categorized by a radiologist into one of 33 categories according to the classification system. Tagged words referencing the same finding were grouped into unique sets. We pre-labeled reports with a rule-based algorithm and then manually edited these annotations for 6705 screening mammography reports (25.1%, 66.8%, and 8.1% BI-RADS 0, 1, and 2, respectively). A combined convolutional and recurrent neural network model was used to label words in each sentence of the individual reports. A siamese recurrent neural network was then used to group findings into sets. Performance of the neural network-based method was compared to a rule-based algorithm and a conditional random field (CRF) model. Global accuracy (percentage of documents where all word tags were predicted correctly) and keyword accuracy (percentage of all words that were labeled correctly, excluding words tagged as unimportant) were calculated on an unseen 519-report test set. Two-tailed t tests were used to assess differences between algorithm performance, and p < 0.05 was used to determine statistical significance. The neural network-based approach showed significantly higher global accuracy compared to both the rule-based algorithm (88.3% vs. 57.0%, p < 0.001) and the CRF model (88.3% vs. 75.8%, p < 0.001). The neural network also showed significantly higher keyword accuracy compared to the rule-based algorithm (95.5% vs. 80.9%, p < 0.001) and the CRF model (95.5% vs. 76.9%, p < 0.001). We demonstrate the potential of neural networks to accurately perform word-level multilabel classification of free-text radiology reports across 33 classes, thus showing the utility of a sequence-labeling approach to NLP of radiology reports. We found that a neural network classifier outperforms a rule-based algorithm and a CRF classifier for comprehensive multilabel classification of free-text screening mammography reports at the word level. By approaching radiology report classification as a sequence-labeling problem, we demonstrate the ability of neural networks to extract data from free-text radiology reports at a level of granularity not previously reported.
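The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes each report is a list of word tags, that the "unimportant" category is written as "O" (a hypothetical tag name), and that keyword accuracy skips words whose reference tag is unimportant; the abstract does not say whether the exclusion is based on the reference tag or the predicted tag. The tag names in the toy data are also invented for illustration.

```python
from typing import List, Sequence

# Hypothetical name for the "unimportant" category; the abstract does not give one.
UNIMPORTANT = "O"


def global_accuracy(gold_docs: Sequence[Sequence[str]],
                    pred_docs: Sequence[Sequence[str]]) -> float:
    """Fraction of documents in which every word tag was predicted correctly."""
    if len(gold_docs) != len(pred_docs) or not gold_docs:
        raise ValueError("gold and predicted document lists must be non-empty and equal in length")
    exact = sum(1 for g, p in zip(gold_docs, pred_docs) if list(g) == list(p))
    return exact / len(gold_docs)


def keyword_accuracy(gold_docs: Sequence[Sequence[str]],
                     pred_docs: Sequence[Sequence[str]],
                     ignore: str = UNIMPORTANT) -> float:
    """Fraction of words tagged correctly, skipping words whose gold tag is `ignore`."""
    correct = 0
    total = 0
    for g, p in zip(gold_docs, pred_docs):
        if len(g) != len(p):
            raise ValueError("each document needs one predicted tag per word")
        for gold_tag, pred_tag in zip(g, p):
            if gold_tag == ignore:
                continue  # assumption: exclusion is decided by the reference tag
            total += 1
            correct += int(gold_tag == pred_tag)
    return correct / total if total else 0.0


# Toy data with invented tag names: two short "reports".
gold: List[List[str]] = [["O", "LATERALITY", "FINDING"], ["O", "O", "BIRADS"]]
pred: List[List[str]] = [["O", "LATERALITY", "FINDING"], ["O", "FINDING", "O"]]

print(global_accuracy(gold, pred))   # one of two documents is fully correct
print(keyword_accuracy(gold, pred))  # two of three keyword tags are correct
```

Global accuracy is the stricter measure because a single wrong tag fails the whole document, which is consistent with the reported global figures being lower than the keyword figures for the neural network and the rule-based algorithm.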

Published In

J Digit Imaging

DOI

10.1007/s10278-018-0141-4

EISSN

1618-727X

Publication Date

October 2019

Volume

32

Issue

5

Start / End Page

685 / 692

Location

United States

Related Subject Headings

  • Research Report
  • Reproducibility of Results
  • Nuclear Medicine & Medical Imaging
  • Neural Networks, Computer
  • Mammography
  • Image Interpretation, Computer-Assisted
  • Humans
  • Female
  • Electronic Health Records
  • Databases, Factual
 

Citation

APA
Short, R. G., Bralich, J., Bogaty, D., & Befera, N. T. (2019). Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach. J Digit Imaging, 32(5), 685–692. https://doi.org/10.1007/s10278-018-0141-4

Chicago
Short, Ryan G., John Bralich, Dave Bogaty, and Nicholas T. Befera. “Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach.” J Digit Imaging 32, no. 5 (October 2019): 685–92. https://doi.org/10.1007/s10278-018-0141-4.

ICMJE
Short RG, Bralich J, Bogaty D, Befera NT. Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach. J Digit Imaging. 2019 Oct;32(5):685–92.

MLA
Short, Ryan G., et al. “Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach.” J Digit Imaging, vol. 32, no. 5, Oct. 2019, pp. 685–92. PubMed, doi:10.1007/s10278-018-0141-4.

NLM
Short RG, Bralich J, Bogaty D, Befera NT. Comprehensive Word-Level Classification of Screening Mammography Reports Using a Neural Network Sequence Labeling Approach. J Digit Imaging. 2019 Oct;32(5):685–692.