Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records.

Publication, Journal Article
Zhao, SS; Hong, C; Cai, T; Xu, C; Huang, J; Ermann, J; Goodson, NJ; Solomon, DH; Cai, T; Liao, KP
Published in: Rheumatology (Oxford)
May 1, 2020

OBJECTIVES: To develop classification algorithms that accurately identify axial SpA (axSpA) patients in electronic health records, and to compare the performance of algorithms incorporating free-text data against approaches using only International Classification of Diseases (ICD) codes.

METHODS: An enriched cohort of 7853 eligible patients was created from electronic health records of two large hospitals using automated searches (≥1 ICD code combined with simple text searches). Key disease concepts were extracted from free-text data using natural language processing (NLP) and combined with ICD codes to develop algorithms. We created both supervised regression-based algorithms (on a training set of 127 axSpA cases and 423 non-cases) and unsupervised algorithms to identify patients with a high probability of having axSpA from the enriched cohort. Their performance was compared against classifications using ICD codes only.

RESULTS: NLP extracted four disease concepts of high predictive value: ankylosing spondylitis, sacroiliitis, HLA-B27 and spondylitis. The unsupervised algorithm, incorporating both the NLP concept and ICD code for AS, identified the greatest number of patients. By setting the probability threshold to attain 80% positive predictive value, it identified 1509 axSpA patients (mean age 53 years, 71% male). Sensitivity was 0.78, specificity 0.94 and area under the curve 0.93. The two supervised algorithms performed similarly but identified fewer patients. All three outperformed traditional approaches using ICD codes alone (area under the curve 0.80-0.87).

CONCLUSION: Algorithms incorporating free-text data can accurately identify axSpA patients in electronic health records. Large cohorts identified using these novel methods offer exciting opportunities for future clinical research.
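The thresholding step described in the results (choosing a probability cut-off that attains a target positive predictive value, then reading off sensitivity and specificity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the rule of taking the lowest qualifying threshold, and the toy probabilities and labels are all assumptions made for illustration, and none of the numbers come from the study.

```python
from typing import List, Optional, Tuple


def metrics_at_threshold(
    probs: List[float], labels: List[int], threshold: float
) -> Tuple[float, float, float]:
    """Return (PPV, sensitivity, specificity), calling a patient positive
    when the predicted probability is >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 1:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return ppv, sensitivity, specificity


def lowest_threshold_for_ppv(
    probs: List[float], labels: List[int], target_ppv: float
) -> Optional[float]:
    """Scan candidate thresholds in ascending order and return the first one
    whose PPV meets the target. Choosing the lowest such threshold is an
    assumed rule that keeps as many patients as possible."""
    for t in sorted(set(probs)):
        ppv, _, _ = metrics_at_threshold(probs, labels, t)
        if ppv >= target_ppv:  # a NaN PPV never satisfies this comparison
            return t
    return None


# Hypothetical toy data (NOT from the study): predicted probabilities and
# chart-review labels (1 = axSpA case, 0 = non-case).
toy_probs = [0.95, 0.90, 0.85, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

threshold = lowest_threshold_for_ppv(toy_probs, toy_labels, target_ppv=0.80)
ppv, sensitivity, specificity = metrics_at_threshold(toy_probs, toy_labels, threshold)
print(f"threshold={threshold}, PPV={ppv:.2f}, "
      f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

PPV is not guaranteed to rise monotonically with the threshold, so a scan over candidate cut-offs is used here rather than a closed-form rule; other selection rules would be equally defensible.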

Published In

Rheumatology (Oxford)

DOI

10.1093/rheumatology/kez375

EISSN

1462-0332

Publication Date

May 1, 2020

Volume

59

Issue

5

Start / End Page

1059 / 1065

Location

England

Related Subject Headings

  • Spondylitis, Ankylosing
  • Spondylarthritis
  • Sensitivity and Specificity
  • Quality Improvement
  • Natural Language Processing
  • Middle Aged
  • Male
  • International Classification of Diseases
  • Humans
  • Female
 

Citation

APA: Zhao, S. S., Hong, C., Cai, T., Xu, C., Huang, J., Ermann, J., … Liao, K. P. (2020). Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records. Rheumatology (Oxford), 59(5), 1059–1065. https://doi.org/10.1093/rheumatology/kez375
Chicago: Zhao, Sizheng Steven, Chuan Hong, Tianrun Cai, Chang Xu, Jie Huang, Joerg Ermann, Nicola J. Goodson, Daniel H. Solomon, Tianxi Cai, and Katherine P. Liao. “Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records.” Rheumatology (Oxford) 59, no. 5 (May 1, 2020): 1059–65. https://doi.org/10.1093/rheumatology/kez375.
ICMJE: Zhao SS, Hong C, Cai T, Xu C, Huang J, Ermann J, et al. Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records. Rheumatology (Oxford). 2020 May 1;59(5):1059–65.
MLA: Zhao, Sizheng Steven, et al. “Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records.” Rheumatology (Oxford), vol. 59, no. 5, May 2020, pp. 1059–65. PubMed, doi:10.1093/rheumatology/kez375.
NLM: Zhao SS, Hong C, Cai T, Xu C, Huang J, Ermann J, Goodson NJ, Solomon DH, Cai T, Liao KP. Incorporating natural language processing to improve classification of axial spondyloarthritis using electronic health records. Rheumatology (Oxford). 2020 May 1;59(5):1059–1065.