Predicting false negative errors in digital breast tomosynthesis among radiology trainees using a computer vision-based approach
© 2016 Elsevier Ltd. All rights reserved.

Purpose: Digital breast tomosynthesis (DBT) can improve lesion visibility compared with mammography by eliminating the superimposition of breast tissue. While the benefits of DBT in breast cancer screening rely on well-trained radiologists, the optimal training regimen in DBT is unknown. We propose a computer-aided educational system that individually selects the optimal training cases for each trainee. The first step towards this goal is to capture the individual weaknesses of each trainee. In this study, we present and evaluate a computer algorithm for this purpose, with particular focus on false negative errors.

Methods: We developed an algorithm (a user model) that predicts the likelihood of a trainee missing an abnormal location. An individual model is built for each trainee. The algorithm consists of three steps. First, the lesions on DBT images are segmented by a 3D active contour method with a level set algorithm. Then, 16 features are extracted automatically from the segmented lesions. Finally, a multivariate logistic regression classifier predicts the likelihood of error based on the extracted features. The classifier is trained using the previous interpretation data of the trainee. We evaluated the individual predictive algorithms experimentally using data from a reader study in which 29 trainees and 3 expert breast radiologists read 60 DBT cases. Receiver operating characteristic (ROC) analysis, along with a repeated holdout approach, was used to evaluate the predictive performance of our algorithm.

Results: The average area under the ROC curve (AUC) of the algorithms that predicted which lesions would be detected and which would be missed by a specific trainee was 0.627 (95% CI: 0.579-0.675). The average performance was statistically significantly better than chance (p<0.001). Under the status quo, training involves no specific strategy for case presentation, and this random behavior corresponds to an AUC of 0.5. Therefore, the proposed algorithm may provide a significant improvement in distinguishing abnormal locations that will be detected by a trainee from those that will be missed.

Conclusions: Our algorithm was able to distinguish abnormal locations that will be detected by a trainee from those that will be missed. This could be used to enrich the training set with cases that are likely to prompt error for the individual trainee while still maintaining the range of cases necessary for comprehensive education.
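The final two stages described in the Methods (a per-trainee logistic regression on 16 lesion features, scored by ROC AUC over repeated holdout splits) can be illustrated with a minimal sketch. This is not the authors' implementation: segmentation and feature extraction are omitted, the data are synthetic, and every name and parameter below (`make_synthetic_trainee`, `repeated_holdout_auc`, 200 lesions, 50 repeats, a 30% test split, the L2 penalty) is a hypothetical choice made only for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=500, l2=1e-2):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

def predict_proba(X, w, b):
    """Predicted probability of the positive class (here: lesion missed)."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def auc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic (average ranks for ties)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0  # average 1-based rank of the tie group
        i = j + 1
    n_pos = int((y_true == 1).sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)

def repeated_holdout_auc(X, y, n_repeats=50, test_frac=0.3, seed=0):
    """Mean test AUC over random train/test splits for one trainee's model."""
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        n_test = int(round(test_frac * len(y)))
        test, train = idx[:n_test], idx[n_test:]
        # Skip splits where either side lacks both outcomes (AUC undefined).
        if len(np.unique(y[test])) < 2 or len(np.unique(y[train])) < 2:
            continue
        # Standardize using training statistics only, to avoid leakage.
        mu, sd = X[train].mean(axis=0), X[train].std(axis=0) + 1e-12
        w, b = fit_logistic((X[train] - mu) / sd, y[train])
        aucs.append(auc(y[test], predict_proba((X[test] - mu) / sd, w, b)))
    return float(np.mean(aucs))

def make_synthetic_trainee(n_lesions=200, n_features=16, seed=0):
    """Hypothetical data: 16 random features; label 1 means the trainee missed the lesion."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_lesions, n_features))
    p_miss = 1.0 / (1.0 + np.exp(-2.0 * X[:, 0]))  # only the first feature is informative
    y = (rng.random(n_lesions) < p_miss).astype(int)
    return X, y

X, y = make_synthetic_trainee()
print("mean holdout AUC:", repeated_holdout_auc(X, y))
```

In a setting like the one the abstract describes, one such model would be fitted per trainee on that trainee's own reading history, and the per-trainee AUCs would then be averaged. An AUC of 0.5 from this procedure would correspond to the random case ordering the abstract calls the status quo.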
Wang, M; Zhang, J; Grimm, LJ; Ghate, SV; Walsh, R; Johnson, KS; Lo, JY; Mazurowski, MA