Estimating dataset size requirements for classifying DNA microarray data.

Publication, Journal Article
Mukherjee, S; Tamayo, P; Rogers, S; Rifkin, R; Engle, A; Campbell, C; Golub, TR; Mesirov, JP
Published in: Journal of computational biology : a journal of computational molecular cell biology
January 2003

A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems representing a broad spectrum of levels of complexity.
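The learning-curve idea in the abstract can be illustrated with a minimal sketch. It assumes the inverse power law takes the common form e(n) = a * n^(-alpha) + b, where e(n) is the error rate at training-set size n and b is the asymptotic error; the abstract does not state the exact parameterization or fitting procedure. The function names, the grid scan over b, and the synthetic data are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def fit_inverse_power_law(sizes, errors, b_steps=200):
    """Fit e(n) = a * n**(-alpha) + b to observed (size, error) pairs.

    Assumed approach (not from the paper): scan candidate asymptotes b,
    fit a and alpha by least squares in log space for each candidate, and
    keep the one with the smallest squared error on the original scale.
    Returns (a, alpha, b).
    """
    if len(sizes) != len(errors) or len(sizes) < 3:
        raise ValueError("need at least three (size, error) pairs")
    if min(errors) <= 0:
        raise ValueError("errors must be positive for the log-space fit")

    xs = [math.log(n) for n in sizes]
    mean_x = sum(xs) / len(xs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    best = None
    for i in range(b_steps):
        b = min(errors) * i / b_steps  # candidate asymptote in [0, min(errors))
        ys = [math.log(e - b) for e in errors]
        mean_y = sum(ys) / len(ys)
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
        a, alpha = math.exp(mean_y - slope * mean_x), -slope
        sse = sum((a * n ** (-alpha) + b - e) ** 2 for n, e in zip(sizes, errors))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    return best[1:]


def required_size(a, alpha, b, target_error):
    """Real-valued n at which the fitted curve reaches target_error.

    Returns None when the target lies at or below the asymptote b, since
    no amount of additional data reaches it under the fitted model.
    """
    if target_error <= b or a <= 0 or alpha <= 0:
        return None
    return ((target_error - b) / a) ** (-1.0 / alpha)


# Synthetic illustration only: error rates generated from a known curve,
# not results from the paper.
sizes = [10, 20, 40, 80, 160]
errors = [1.0 * n ** -0.5 + 0.1 for n in sizes]
a, alpha, b = fit_inverse_power_law(sizes, errors)
print(f"a={a:.3f}, alpha={alpha:.3f}, b={b:.3f}")
print("samples needed for 15% error:", required_size(a, alpha, b, 0.15))
```

Inverting the fitted curve is what turns existing classification results into a dataset size estimate: a target error above the fitted asymptote maps to a finite sample size, while a target at or below it is flagged as unreachable under the model.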

Published In

Journal of computational biology : a journal of computational molecular cell biology

DOI

10.1089/106652703321825928

EISSN

1557-8666

ISSN

1066-5277

Publication Date

January 2003

Volume

10

Issue

2

Start / End Page

119 / 142

Related Subject Headings

  • Oligonucleotide Array Sequence Analysis
  • Neoplasms
  • Models, Molecular
  • Humans
  • Gene Expression Profiling
  • Computer Simulation
  • Computational Biology
  • Bioinformatics
  • Algorithms
  • 49 Mathematical sciences
 

Citation

APA: Mukherjee, S., Tamayo, P., Rogers, S., Rifkin, R., Engle, A., Campbell, C., … Mesirov, J. P. (2003). Estimating dataset size requirements for classifying DNA microarray data. Journal of Computational Biology : A Journal of Computational Molecular Cell Biology, 10(2), 119–142. https://doi.org/10.1089/106652703321825928

Chicago: Mukherjee, Sayan, Pablo Tamayo, Simon Rogers, Ryan Rifkin, Anna Engle, Colin Campbell, Todd R. Golub, and Jill P. Mesirov. “Estimating dataset size requirements for classifying DNA microarray data.” Journal of Computational Biology : A Journal of Computational Molecular Cell Biology 10, no. 2 (January 2003): 119–42. https://doi.org/10.1089/106652703321825928.

ICMJE: Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, et al. Estimating dataset size requirements for classifying DNA microarray data. Journal of computational biology : a journal of computational molecular cell biology. 2003 Jan;10(2):119–42.

MLA: Mukherjee, Sayan, et al. “Estimating dataset size requirements for classifying DNA microarray data.” Journal of Computational Biology : A Journal of Computational Molecular Cell Biology, vol. 10, no. 2, Jan. 2003, pp. 119–42. Epmc, doi:10.1089/106652703321825928.

NLM: Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, Golub TR, Mesirov JP. Estimating dataset size requirements for classifying DNA microarray data. Journal of computational biology : a journal of computational molecular cell biology. 2003 Jan;10(2):119–142.