Cluster analysis of BI-RADS™ descriptions of biopsy-proven breast lesions


Journal Article

The purpose of this study was to identify and characterize clusters in a heterogeneous breast cancer computer-aided diagnosis database. Identifying subgroups within the database could help elucidate clinical trends and facilitate future model building. Agglomerative hierarchical clustering and k-means clustering were applied to a large database of biopsy-proven breast lesions, using mammographic findings (BI-RADS™) and patient age as features. The resulting clusters were examined in terms of their feature distributions. They showed logical separation of distinct clinical subtypes such as architectural distortions, masses, and calcifications. Moreover, the common subtypes, masses and calcifications, were further stratified into clusters by age group. The percentage of malignant cases differed notably among the clusters. Cluster analysis can thus be a powerful tool for discerning the subgroups present in a large, heterogeneous computer-aided diagnosis database.
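As a rough illustration of the k-means step named in the abstract, the following sketch clusters a handful of toy cases and then reports the fraction of malignant cases per cluster. Everything specific here is an assumption made for illustration: the feature encoding (age scaled to 0-1 plus binary mass and calcification flags), the toy records, the choice of k = 2, and the helper names `kmeans` and `malignancy_by_cluster`. None of it is taken from the study, and the agglomerative hierarchical clustering the authors also used is not shown.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def malignancy_by_cluster(labels, outcomes):
    """Map cluster id -> (cluster size, fraction of malignant cases)."""
    summary = {}
    for c in sorted(set(labels)):
        members = [o for lab, o in zip(labels, outcomes) if lab == c]
        summary[c] = (len(members), sum(members) / len(members))
    return summary


# Toy records (made up, NOT study data):
# (age / 100, is_mass, is_calcification, malignant)
cases = [
    (0.45, 1, 0, 0), (0.48, 1, 0, 0), (0.72, 1, 0, 1), (0.75, 1, 0, 1),
    (0.42, 0, 1, 0), (0.47, 0, 1, 1), (0.70, 0, 1, 1), (0.68, 0, 1, 0),
]
features = [list(c[:3]) for c in cases]
outcomes = [c[3] for c in cases]

centroids, labels = kmeans(features, k=2)
summary = malignancy_by_cluster(labels, outcomes)
for cluster, (size, frac) in summary.items():
    print(f"cluster {cluster}: n={size}, malignant fraction={frac:.2f}")
```

With real data, the number of clusters and the scaling of age relative to the categorical BI-RADS descriptors would both need careful choice, since Euclidean k-means is sensitive to how mixed feature types are encoded.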

Cited Authors

  • Markey, MK; Lo, JY; Tourassi, GD; Floyd, CE

Published Date

  • January 1, 2002

Volume / Issue

  • 4684 I

Start / End Page

  • 363 - 370

International Standard Serial Number (ISSN)

  • 0277-786X

Digital Object Identifier (DOI)

  • 10.1117/12.467177

Citation Source

  • Scopus