Cross-institutional evaluation of BI-RADS predictive model for mammographic diagnosis of breast cancer.
OBJECTIVE: Given a predictive model for identifying very likely benign breast lesions on the basis of Breast Imaging Reporting and Data System (BI-RADS) mammographic findings, this study evaluated the model's ability to generalize to a patient data set from a different institution.

MATERIALS AND METHODS: The artificial neural network model underwent three trials: it was optimized over 500 biopsy-proven lesions from Duke University Medical Center or "Duke," evaluated on 1,000 similar cases from the University of Pennsylvania Health System or "Penn," and reoptimized for Penn.

RESULTS: Trial A's Duke-only model yielded 98% sensitivity, 36% specificity, area index (A(z)) of 0.86, and partial A(z) of 0.51. The cross-institutional trial B yielded 96% sensitivity, 28% specificity, A(z) of 0.79, and partial A(z) of 0.28. The decreases were significant for both A(z) (p = 0.017) and partial A(z) (p < 0.001). In trial C, the model reoptimized for the Penn data yielded 96% sensitivity, 35% specificity, A(z) of 0.83, and partial A(z) of 0.32. There were no significant differences compared with trial B for specificity (p = 0.44) or partial A(z) (p = 0.46), suggesting that the Penn data were inherently more difficult to characterize.

CONCLUSION: The BI-RADS lexicon facilitated the cross-institutional test of a breast cancer prediction model. The model generalized reasonably well, but there were significant performance decreases. The cross-institutional performance was encouraging because it was not significantly different from that of a reoptimized model using the second data set at high sensitivities. This study indicates the need for further work to collect more data and to improve the robustness of the model.
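The reported metrics (sensitivity, specificity, and area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the scores stand in for any model output where higher means more suspicious, and the area is the empirical (Mann-Whitney) estimate rather than the fitted area index A(z) reported in the abstract. The partial A(z) is not computed here.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when a case is called positive
    (suspicious) if its score is >= threshold.

    labels: 1 = malignant, 0 = benign. Assumes both classes are present.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def threshold_for_sensitivity(
    scores: Sequence[float], labels: Sequence[int], target: float
) -> float:
    """Return the highest observed score that, used as a threshold,
    reaches at least the target sensitivity."""
    for t in sorted(set(scores), reverse=True):
        sens, _ = sensitivity_specificity(scores, labels, t)
        if sens >= target:
            return t
    raise ValueError("target sensitivity cannot be reached")


def empirical_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Empirical ROC area: the fraction of (malignant, benign) pairs in
    which the malignant case scores higher, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In a cross-institutional setting like trial B, one plausible use is to fix the threshold on the first institution's cases with `threshold_for_sensitivity` and then apply `sensitivity_specificity` with that same threshold to the second institution's cases. Whether the study chose its operating point this way is not stated in the abstract.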
Lo, JY; Markey, MK; Baker, JA; Floyd, CE