
Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping.

Publication, Journal Article
Link, NB; Huang, S; Cai, T; Sun, J; Dahal, K; Costa, L; Cho, K; Liao, K; Cai, T; Hong, C; Million Veteran Program
Published in: Int J Med Inform
April 1, 2022

OBJECTIVE: The use of electronic health record (EHR) systems has grown over the past decade, and with it, the need to extract information from unstructured clinical narratives. Clinical notes, however, frequently contain acronyms with several potential senses (meanings), and traditional natural language processing (NLP) techniques cannot differentiate between these senses. In this study we introduce a semi-supervised method for binary acronym disambiguation, the task of classifying whether an acronym in a clinical EHR note carries a target sense.

METHODS: We developed a semi-supervised ensemble machine learning (CASEml) algorithm to automatically identify when an acronym means a target sense by leveraging semantic embeddings, visit-level text, and billing information. The algorithm was validated using note data from the Veterans Affairs hospital system to classify the meaning of three acronyms: RA, MS, and MI. We compared the performance of CASEml against another standard semi-supervised method and a baseline metric selecting the most frequent acronym sense. Along with evaluating the performance of these methods for specific instances of acronyms, we evaluated the impact of acronym disambiguation on NLP-driven phenotyping of rheumatoid arthritis.

RESULTS: CASEml achieved accuracies of 0.947, 0.911, and 0.706 for RA, MS, and MI, respectively, higher than a standard baseline metric and (on average) higher than a state-of-the-art semi-supervised method. We also demonstrated that applying CASEml to medical notes improves the AUC of a phenotype algorithm for rheumatoid arthritis.

CONCLUSION: CASEml is a novel method that accurately disambiguates acronyms in clinical notes and has advantages over commonly used supervised and semi-supervised machine learning approaches. In addition, CASEml improves the performance of NLP tasks that rely on ambiguous acronyms, such as phenotyping.
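
The most-frequent-sense baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code and not CASEml itself: the function names and the toy labels below are hypothetical, invented only to show how such a baseline and its accuracy might be computed for a binary target sense.

```python
from collections import Counter


def most_frequent_sense(train_labels):
    """Return the label that occurs most often among the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Toy labels, invented for illustration only (not data from the study):
# True means the acronym carries the target sense, False means any other sense.
train = [True, True, False, True, False, True]
test = [True, False, True, True]

# The baseline ignores context and always predicts the majority training sense.
majority = most_frequent_sense(train)
predictions = [majority] * len(test)
baseline_acc = accuracy(predictions, test)
print(majority, baseline_acc)
```

A context-aware method is only useful to the extent that it beats this kind of constant prediction, which is why the abstract reports it as a point of comparison.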


Published In

Int J Med Inform

DOI

10.1016/j.ijmedinf.2022.104753

EISSN

1872-8243

Publication Date

April 1, 2022

Volume

162

Start / End Page

104753

Location

Ireland

Related Subject Headings

  • Medical Informatics
  • 46 Information and computing sciences
  • 42 Health sciences
  • 32 Biomedical and clinical sciences
  • 11 Medical and Health Sciences
  • 09 Engineering
  • 08 Information and Computing Sciences
 

Citation

APA: Link, N. B., Huang, S., Cai, T., Sun, J., Dahal, K., Costa, L., … Million Veteran Program. (2022). Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping. Int J Med Inform, 162, 104753. https://doi.org/10.1016/j.ijmedinf.2022.104753
Chicago: Link, Nicholas B., Sicong Huang, Tianrun Cai, Jiehuan Sun, Kumar Dahal, Lauren Costa, Kelly Cho, et al. “Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping.” Int J Med Inform 162 (April 1, 2022): 104753. https://doi.org/10.1016/j.ijmedinf.2022.104753.
ICMJE: Link NB, Huang S, Cai T, Sun J, Dahal K, Costa L, et al. Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping. Int J Med Inform. 2022 Apr 1;162:104753.
MLA: Link, Nicholas B., et al. “Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping.” Int J Med Inform, vol. 162, Apr. 2022, p. 104753. Pubmed, doi:10.1016/j.ijmedinf.2022.104753.
NLM: Link NB, Huang S, Cai T, Sun J, Dahal K, Costa L, Cho K, Liao K, Hong C, Million Veteran Program. Binary acronym disambiguation in clinical notes from electronic health records with an application in computational phenotyping. Int J Med Inform. 2022 Apr 1;162:104753.