Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms.

Publication, Journal Article
Jorge, A; Castro, VM; Barnado, A; Gainer, V; Hong, C; Cai, T; Cai, T; Carroll, R; Denny, JC; Crofford, L; Costenbader, KH; Liao, KP ...
Published in: Semin Arthritis Rheum
August 2019

OBJECTIVE: To utilize electronic health records (EHRs) to study SLE, algorithms are needed to accurately identify these patients. We used machine learning to generate data-driven SLE EHR algorithms and assessed performance of existing rule-based algorithms.

METHODS: We randomly selected subjects with ≥ 1 SLE ICD-9/10 codes from our EHR and identified gold standard definite and probable SLE cases by chart review, based on 1997 ACR or 2012 SLICC Classification Criteria. From a training set, we extracted coded and narrative concepts using natural language processing and generated algorithms using penalized logistic regression to classify definite or definite/probable SLE. We assessed predictive characteristics in internal and external cohort validations. We also tested performance characteristics of published rule-based algorithms with pre-specified permutations of ICD-9 codes, laboratory tests and medications in our EHR.

RESULTS: At a specificity of 97%, our machine learning coded algorithm for definite SLE had 90% positive predictive value (PPV) and 64% sensitivity and for definite/probable SLE, 92% PPV and 47% sensitivity. In the external validation, at 97% specificity, the definite/probable algorithm had 94% PPV and 60% sensitivity. Adding NLP concepts did not improve performance metrics. The PPVs of published rule-based algorithms ranged from 45-79% in our EHR.

CONCLUSION: Our machine learning SLE algorithms performed well in internal and external validation. Rule-based SLE algorithms did not transport as well to our EHR. Unique EHR characteristics, clinical practices and research goals regarding the desired sensitivity and specificity of the case definition must be considered when applying algorithms to identify SLE patients.
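The results above report PPV and sensitivity at a fixed specificity of 97%. As a rough illustration of how such an operating point can be derived from a classifier's output, the sketch below picks the lowest score cut-off that reaches a target specificity and then reports sensitivity and PPV at that cut-off. It is not the authors' code: the function names, the toy scores and labels, and the 80% target are all hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) when a score >= threshold is called a case."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        predicted_case = score >= threshold
        if predicted_case and label == 1:
            tp += 1
        elif predicted_case and label == 0:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def threshold_for_specificity(
    scores: Sequence[float], labels: Sequence[int], target: float
) -> float:
    """Lowest observed cut-off whose specificity is at least `target`."""
    if 0 not in labels:
        raise ValueError("specificity is undefined without non-cases")
    for candidate in sorted(set(scores)):
        _, fp, _, tn = confusion_counts(scores, labels, candidate)
        if tn / (tn + fp) >= target:
            return candidate
    # No observed score is strict enough, so nobody is called a case.
    return float("inf")


def summarize(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity and PPV from confusion counts."""
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
    }


# Hypothetical toy data: model scores and chart-review labels (1 = case).
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.35, 0.30, 0.20, 0.10]
toy_labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

cutoff = threshold_for_specificity(toy_scores, toy_labels, target=0.80)
counts = confusion_counts(toy_scores, toy_labels, cutoff)
result = summarize(*counts)
print(cutoff, counts, result)
```

Fixing specificity first and reading off the other metrics mirrors how the abstract reports its results. In a real validation the cut-off would be chosen on the training set and then applied unchanged to the internal and external cohorts.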

Published In

Semin Arthritis Rheum

DOI

10.1016/j.semarthrit.2019.01.002

EISSN

1532-866X

Publication Date

August 2019

Volume

49

Issue

1

Start / End Page

84 / 90

Location

United States

Related Subject Headings

  • Sensitivity and Specificity
  • Natural Language Processing
  • Middle Aged
  • Male
  • Machine Learning
  • Lupus Erythematosus, Systemic
  • Humans
  • Female
  • Electronic Health Records
  • Databases, Factual
 

Citation

APA: Jorge, A., Castro, V. M., Barnado, A., Gainer, V., Hong, C., Cai, T., … Feldman, C. H. (2019). Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum, 49(1), 84–90. https://doi.org/10.1016/j.semarthrit.2019.01.002

Chicago: Jorge, April, Victor M. Castro, April Barnado, Vivian Gainer, Chuan Hong, Tianxi Cai, Tianrun Cai, et al. “Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms.” Semin Arthritis Rheum 49, no. 1 (August 2019): 84–90. https://doi.org/10.1016/j.semarthrit.2019.01.002.

ICMJE: Jorge A, Castro VM, Barnado A, Gainer V, Hong C, Cai T, et al. Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum. 2019 Aug;49(1):84–90.

MLA: Jorge, April, et al. “Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms.” Semin Arthritis Rheum, vol. 49, no. 1, Aug. 2019, pp. 84–90. Pubmed, doi:10.1016/j.semarthrit.2019.01.002.

NLM: Jorge A, Castro VM, Barnado A, Gainer V, Hong C, Cai T, Carroll R, Denny JC, Crofford L, Costenbader KH, Liao KP, Karlson EW, Feldman CH. Identifying lupus patients in electronic health records: Development and validation of machine learning algorithms and application of rule-based algorithms. Semin Arthritis Rheum. 2019 Aug;49(1):84–90.