Estimating dataset size requirements for classifying DNA microarray data.
Publication, Journal Article
Mukherjee, S; Tamayo, P; Rogers, S; Rifkin, R; Engle, A; Campbell, C; Golub, TR; Mesirov, JP
Published in: Journal of computational biology : a journal of computational molecular cell biology
January 2003
A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems representing a broad spectrum of levels of complexity.
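The two ingredients named in the abstract, an inverse power-law learning curve and a permutation test, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' published procedure: the three-parameter form e(n) = a * n^(-alpha) + b (with b read as the asymptotic error), the grid-search fitting heuristic, and the function names `fit_inverse_power_law`, `required_size`, and `permutation_p_value`. The `error_fn` argument is a hypothetical callable that returns a classifier's error rate for a given labeling.

```python
import math
import random


def fit_inverse_power_law(sizes, errors, b_steps=200):
    """Fit e(n) = a * n**(-alpha) + b to observed (size, error) pairs.

    Heuristic (an assumption, not the paper's procedure): for each candidate
    asymptote b on a grid, fit log(e - b) = log(a) - alpha * log(n) by
    ordinary least squares, then keep the b with the smallest squared error
    measured on the original error scale.
    """
    b_max = min(errors)
    if b_max <= 0:
        raise ValueError("all observed error rates must be positive")
    xs = [math.log(n) for n in sizes]
    x_mean = sum(xs) / len(xs)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    best = None
    for k in range(b_steps):
        b = b_max * k / b_steps  # stays strictly below the smallest error
        ys = [math.log(e - b) for e in errors]
        y_mean = sum(ys) / len(ys)
        sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
        slope = sxy / sxx
        a = math.exp(y_mean - slope * x_mean)
        alpha = -slope
        sse = sum((a * n ** (-alpha) + b - e) ** 2
                  for n, e in zip(sizes, errors))
        if best is None or sse < best[0]:
            best = (sse, a, alpha, b)
    return best[1], best[2], best[3]


def required_size(target_error, a, alpha, b):
    """Solve a * n**(-alpha) + b = target_error for n."""
    if target_error <= b:
        raise ValueError("target error is at or below the fitted asymptote")
    return ((target_error - b) / a) ** (-1.0 / alpha)


def permutation_p_value(labels, error_fn, n_permutations=1000, seed=0):
    """Share of label permutations whose error is at most the observed error.

    error_fn(labels) is a hypothetical callable returning an error rate.
    The +1 terms count the observed labeling as one of the permutations.
    """
    rng = random.Random(seed)
    observed = error_fn(labels)
    hits = 0
    for _ in range(n_permutations):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        if error_fn(shuffled) <= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Synthetic learning curve, invented purely for demonstration.
    sizes = [10, 20, 40, 80, 160]
    errors = [1.0 * n ** -0.5 + 0.1 for n in sizes]
    a, alpha, b = fit_inverse_power_law(sizes, errors)
    print("fitted a, alpha, b:", a, alpha, b)
    print("samples for 15% error:", required_size(0.15, a, alpha, b))
```

With a fitted curve in hand, `required_size` turns a target error rate into an estimated number of training samples, which is the kind of planning question the abstract describes. The permutation p-value is only meaningful if `error_fn` retrains and evaluates the classifier on each permuted labeling.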
Published In
Journal of computational biology : a journal of computational molecular cell biology
DOI
10.1089/106652703321825928
EISSN
1557-8666
ISSN
1066-5277
Publication Date
January 2003
Volume
10
Issue
2
Start / End Page
119 / 142
Related Subject Headings
- Oligonucleotide Array Sequence Analysis
- Neoplasms
- Models, Molecular
- Humans
- Gene Expression Profiling
- Computer Simulation
- Computational Biology
- Bioinformatics
- Algorithms
- 49 Mathematical sciences
Citation
APA
Mukherjee, S., Tamayo, P., Rogers, S., Rifkin, R., Engle, A., Campbell, C., … Mesirov, J. P. (2003). Estimating dataset size requirements for classifying DNA microarray data. Journal of Computational Biology : A Journal of Computational Molecular Cell Biology, 10(2), 119–142. https://doi.org/10.1089/106652703321825928
Chicago
Mukherjee, Sayan, Pablo Tamayo, Simon Rogers, Ryan Rifkin, Anna Engle, Colin Campbell, Todd R. Golub, and Jill P. Mesirov. “Estimating dataset size requirements for classifying DNA microarray data.” Journal of Computational Biology : A Journal of Computational Molecular Cell Biology 10, no. 2 (January 2003): 119–42. https://doi.org/10.1089/106652703321825928.
ICMJE
Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, et al. Estimating dataset size requirements for classifying DNA microarray data. Journal of computational biology : a journal of computational molecular cell biology. 2003 Jan;10(2):119–42.
MLA
Mukherjee, Sayan, et al. “Estimating dataset size requirements for classifying DNA microarray data.” Journal of Computational Biology : A Journal of Computational Molecular Cell Biology, vol. 10, no. 2, Jan. 2003, pp. 119–42. Epmc, doi:10.1089/106652703321825928.
NLM
Mukherjee S, Tamayo P, Rogers S, Rifkin R, Engle A, Campbell C, Golub TR, Mesirov JP. Estimating dataset size requirements for classifying DNA microarray data. Journal of computational biology : a journal of computational molecular cell biology. 2003 Jan;10(2):119–142.