Model-Based Algorithms for Detecting Peripheral Artery Disease Using Administrative Data From an Electronic Health Record Data System: Algorithm Development Study.

Journal Article

BACKGROUND: Peripheral artery disease (PAD) affects 8 to 10 million Americans, who face significantly elevated risks of both mortality and major limb events such as amputation. Unfortunately, PAD is relatively underdiagnosed, undertreated, and underresearched, leading to wide variations in treatment patterns and outcomes. Efforts to improve PAD care and outcomes have been hampered by persistent difficulties identifying patients with PAD for clinical and investigatory purposes.

OBJECTIVE: The aim of this study is to develop and validate a model-based algorithm to detect patients with peripheral artery disease (PAD) using data from an electronic health record (EHR) system.

METHODS: An initial query of the EHR in a large health system identified all patients with PAD-related diagnosis codes for any encounter during the study period. Clinical adjudication of PAD diagnosis was performed by chart review on a random subgroup. A binary logistic regression to predict PAD was built and validated using a least absolute shrinkage and selection operator (LASSO) approach in the adjudicated patients. The algorithm was then applied to the nonsampled records to further evaluate its performance.

RESULTS: The initial EHR data query using 406 diagnostic codes yielded 15,406 patients. Overall, 2500 patients were randomly selected for ground truth PAD status adjudication. In the end, 108 code flags remained after removing rarely- and never-used codes. We entered these code flags plus administrative encounter, imaging, procedure, and specialist flags into a LASSO model. The area under the curve for this model was 0.862.

CONCLUSIONS: The algorithm we constructed has two main advantages over other approaches to the identification of patients with PAD. First, it was derived from a broad population of patients with many different PAD manifestations and treatment pathways across a large health system. Second, our model does not rely on clinical notes and can be applied in situations in which only administrative billing data (eg, large administrative data sets) are available. A combination of diagnosis codes and administrative flags can accurately identify patients with PAD in large cohorts.
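The modeling approach described in METHODS (an L1-penalized, or LASSO, logistic regression over binary code flags, evaluated by area under the curve) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the data are synthetic, the eight flags, the true effect sizes, the penalty strength `lam`, and the learning rate are hypothetical, and the fitting routine is a plain proximal gradient loop written with the Python standard library only. In practice a statistics library (for example, scikit-learn's `LogisticRegression` with an L1 penalty) would likely be used instead.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal step for the L1 penalty: shrink toward zero, clip at zero.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_lasso_logistic(X, y, lam=0.02, lr=0.5, n_iter=500):
    # Proximal gradient descent on mean log-loss + lam * sum(|w_j|).
    # The intercept b is not penalized.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                if xj:
                    grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def auc(scores, labels):
    # Probability that a random positive case outscores a random negative
    # case; ties count as one half.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic cohort (hypothetical): 8 binary flags, only the first 3 informative.
random.seed(0)
TRUE_W = [2.0, 1.5, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

def make_patient():
    flags = [1 if random.random() < 0.3 else 0 for _ in TRUE_W]
    prob = sigmoid(-1.5 + sum(w * f for w, f in zip(TRUE_W, flags)))
    return flags, (1 if random.random() < prob else 0)

cohort = [make_patient() for _ in range(600)]
train, test = cohort[:400], cohort[400:]  # stand-ins for adjudicated vs held-out

weights, bias = fit_lasso_logistic([x for x, _ in train], [y for _, y in train])
test_scores = [sigmoid(bias + sum(w * f for w, f in zip(weights, x)))
               for x, _ in test]
test_auc = auc(test_scores, [y for _, y in test])

print("coefficients:", [round(w, 3) for w in weights])
print("held-out AUC:", round(test_auc, 3))
```

The L1 penalty shrinks coefficients toward zero and can set weak ones exactly to zero, which is why a LASSO approach suits a setting with many candidate code flags. The AUC printed here describes only this synthetic data and has no bearing on the 0.862 reported in the abstract.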

Cited Authors

  • Weissler, EH; Lippmann, SJ; Smerek, MM; Ward, RA; Kansal, A; Brock, A; Sullivan, RC; Long, C; Patel, MR; Greiner, MA; Hardy, NC; Curtis, LH; Jones, WS

Published Date

  • August 19, 2020

Published In

Volume / Issue

  • 8 / 8

Start / End Page

  • e18542 -

PubMed ID

  • 32663152

Pubmed Central ID

  • PMC7468640

International Standard Serial Number (ISSN)

  • 2291-9694

Digital Object Identifier (DOI)

  • 10.2196/18542


Language

  • eng

Conference Location

  • Canada