Assessing operating characteristics of CAD algorithms in the absence of a gold standard.


Journal Article

PURPOSE: The authors examine potential bias when using a reference reader panel as "gold standard" for estimating operating characteristics of CAD algorithms for detecting lesions. As an alternative, the authors propose latent class analysis (LCA), which does not require an external gold standard to evaluate diagnostic accuracy. METHODS: A binomial model for multiple reader detections using different diagnostic protocols was constructed, assuming conditional independence of readings given true lesion status. Operating characteristics of all protocols were estimated by maximum likelihood LCA. Reader-panel-based and LCA-based estimates were compared using data simulated from the binomial model for a range of operating characteristics. LCA was applied to 36 thin-section thoracic computed tomography data sets from the Lung Image Database Consortium (LIDC): free-search markings of four radiologists were compared to markings from four different CAD-assisted radiologists. For real data, bootstrap-based resampling methods, which accommodate dependence in reader detections, are proposed to test hypotheses of differences between detection protocols. RESULTS: In simulation studies, reader-panel-based sensitivity estimates had an average relative bias (ARB) of -23% to -27%, significantly higher (p-value < 0.0001) than LCA (ARB -2% to -6%). Specificity was well estimated by both reader panel (ARB -0.6% to -0.5%) and LCA (ARB 1.4% to 0.5%). Among 1145 lesion candidates LIDC considered, the LCA-estimated sensitivity of reference readers (55%) was significantly lower (p-value 0.006) than that of CAD-assisted readers (68%). The average number of false positives per patient for reference readers (0.95) was not significantly lower (p-value 0.28) than that for CAD-assisted readers (1.27). CONCLUSIONS: Whereas a gold standard based on a consensus of readers may substantially bias sensitivity estimates, LCA may be a significantly more accurate and consistent means for evaluating diagnostic accuracy.
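The idea of estimating operating characteristics without a gold standard can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-class latent class model in which each reader has their own sensitivity and specificity (the abstract's binomial model groups readers by protocol), fits it with a plain EM algorithm, and uses hypothetical names (`simulate`, `fit_lca`, `relative_bias`), starting values, and parameter settings chosen only for illustration. Detections are simulated under conditional independence given true lesion status, as stated in the abstract.

```python
import random


def simulate(n, prevalence, sens, spec, seed=0):
    """Simulate binary detections, independent across readers given true status."""
    rng = random.Random(seed)
    truth, data = [], []
    for _ in range(n):
        d = 1 if rng.random() < prevalence else 0
        # A reader marks a true lesion with prob. se, a non-lesion with prob. 1 - sp.
        row = [int(rng.random() < (se if d else 1.0 - sp))
               for se, sp in zip(sens, spec)]
        truth.append(d)
        data.append(row)
    return truth, data


def _clip(p, eps=1e-6):
    """Keep probabilities away from 0 and 1 so the E-step never divides by zero."""
    return min(max(p, eps), 1.0 - eps)


def fit_lca(data, n_iter=300, tol=1e-8):
    """EM for a two-class latent class model with conditionally independent readers.

    Returns (prevalence, sensitivities, specificities). The starting values are
    an illustrative assumption; they also fix which class is labelled 'lesion'.
    """
    n, k = len(data), len(data[0])
    prev, sens, spec = 0.5, [0.7] * k, [0.9] * k
    for _ in range(n_iter):
        # E-step: posterior probability that each candidate is a true lesion.
        post = []
        for row in data:
            p1, p0 = prev, 1.0 - prev
            for y, se, sp in zip(row, sens, spec):
                p1 *= se if y else 1.0 - se
                p0 *= (1.0 - sp) if y else sp
            post.append(p1 / (p1 + p0))
        # M-step: weighted proportions, using the posteriors as soft labels.
        w1 = sum(post)
        w0 = n - w1
        new_prev = _clip(w1 / n)
        new_sens = [_clip(sum(w * row[j] for w, row in zip(post, data)) / w1)
                    for j in range(k)]
        new_spec = [_clip(sum((1.0 - w) * (1 - row[j]) for w, row in zip(post, data)) / w0)
                    for j in range(k)]
        change = (abs(new_prev - prev)
                  + sum(abs(a - b) for a, b in zip(new_sens, sens))
                  + sum(abs(a - b) for a, b in zip(new_spec, spec)))
        prev, sens, spec = new_prev, new_sens, new_spec
        if change < tol:
            break
    return prev, sens, spec


def relative_bias(estimate, truth):
    """Relative bias of one estimate; averaging over replicates gives an ARB."""
    return (estimate - truth) / truth


if __name__ == "__main__":
    true_sens = [0.6, 0.7, 0.8, 0.65]    # illustrative values, not from the study
    true_spec = [0.95, 0.9, 0.92, 0.97]  # illustrative values, not from the study
    _, detections = simulate(2000, 0.3, true_sens, true_spec, seed=1)
    _, est_sens, _ = fit_lca(detections)
    for se_hat, se in zip(est_sens, true_sens):
        print(f"estimated {se_hat:.3f}, true {se:.3f}, "
              f"relative bias {relative_bias(se_hat, se):+.1%}")
```

Repeating the simulate-and-fit step over many seeds and averaging `relative_bias` would give a simulation-based ARB in the sense used in the abstract. The true lesion status returned by `simulate` is never passed to `fit_lca`, which is the point of the approach.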

Cited Authors

  • Choudhury, KR; Paik, DS; Yi, CA; Napel, S; Roos, J; Rubin, GD

Published Date

  • April 2010

Published In

Volume / Issue

  • 37 / 4

Start / End Page

  • 1788 - 1795

PubMed ID

  • 20443501

Pubmed Central ID

  • 20443501

International Standard Serial Number (ISSN)

  • 0094-2405

Digital Object Identifier (DOI)

  • 10.1118/1.3352687


Language

  • eng

Conference Location

  • United States