An analytical method for multiclass molecular cancer classification


Journal Article

Modern cancer treatment relies upon microscopic tissue examination to classify tumors according to anatomical site of origin. This approach is effective but subjective and variable even among experienced clinicians and pathologists. Recently, DNA microarray-generated gene expression data has been used to build molecular cancer classifiers. Previous work from our group and others demonstrated methods for solving pairwise classification problems using such global gene expression patterns. However, classification across multiple primary tumor classes poses new methodological and computational challenges. In this paper we describe a computational methodology for multiclass prediction that combines class-specific (one vs. all) binary support vector machines. We apply this methodology to the diagnosis of multiple common adult malignancies using DNA microarray data from a collection of 198 tumor samples, spanning 14 of the most common tumor types. Overall classification accuracy is 78%, far exceeding the expected accuracy for random classification. In a large subset of the samples (80%), the algorithm attains 90% accuracy. The methodology described in this paper both demonstrates that accurate gene expression-based multiclass cancer diagnosis is possible and highlights some of the analytic challenges inherent in applying such strategies to biomedical research.
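The one-vs-all combination named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes linear SVMs fitted by plain subgradient descent on the hinge loss, a highest-decision-value rule for combining the binary classifiers, and toy 2-D points standing in for gene expression profiles. All function names, hyperparameters, and data are hypothetical.

```python
# Illustrative sketch only: toy 2-D points stand in for expression profiles.
# Function names, hyperparameters, and data are assumptions, not from the paper.

def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=500):
    """Fit (w, b) by full-batch subgradient descent on the regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_all(X, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for c in sorted(set(labels)):
        y = [1.0 if label == c else -1.0 for label in labels]
        models[c] = train_linear_svm(X, y)
    return models


def decision_values(models, x):
    """Signed decision value of every class-specific classifier for sample x."""
    return {c: sum(wj * xj for wj, xj in zip(w, x)) + b
            for c, (w, b) in models.items()}


def predict(models, x):
    """Assumed combination rule: pick the class with the largest decision value."""
    scores = decision_values(models, x)
    return max(scores, key=scores.get)


# Three well-separated toy classes, four points around each center.
centers = {0: (4.0, 0.0), 1: (-4.0, 4.0), 2: (-4.0, -4.0)}
offsets = [(0.5, 0.5), (0.5, -0.5), (-0.5, 0.5), (-0.5, -0.5)]
X, labels = [], []
for c, (cx, cy) in centers.items():
    for dx, dy in offsets:
        X.append((cx + dx, cy + dy))
        labels.append(c)

models = train_one_vs_all(X, labels)
for c, center in centers.items():
    print("center", center, "-> predicted class", predict(models, center))
```

The gap between the largest and second-largest decision values could serve as a rough confidence measure for each call. Whether the paper defines its high-confidence subset this way is not stated in the abstract.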

Cited Authors

  • Rifkin, R; Mukherjee, S; Tamayo, P; Ramaswamy, S; Yeang, CH; Angelo, M; Reich, M; Poggio, T; Lander, ES; Golub, TR; Mesirov, JP

Published Date

  • January 1, 2003

Published In

Volume / Issue

  • 45 / 4

Start / End Page

  • 706 - 723

International Standard Serial Number (ISSN)

  • 0036-1445

Digital Object Identifier (DOI)

  • 10.1137/S0036144502411986

Citation Source

  • Scopus