Sorting multiple classes in multi-dimensional ROC analysis: parametric and nonparametric approaches.
In large-scale data analysis, such as a microarray study to identify the most differentially expressed genes, diagnostic tests are frequently used to classify subjects and predict their categories. Often these categories have no intrinsic natural order, even though the quantitative test results do have a relative order. Because identifying the correct order is essential for properly defining accuracy measures in high-dimensional receiver operating characteristic (ROC) analysis, we propose rigorous and automated approaches for sorting the multiple categories using simple summary statistics such as means and relative effects. We discuss the hypervolume under the ROC manifold (HUM), its dependence on the order of the test results, and the minimum acceptable HUM values in a general multi-category classification problem. Using a leukemia data set and a liver cancer data set, we show how our approaches provide accurate screening results when the number of tests is large.
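To make the dependence of HUM on class order concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a single test, sorts the classes by sample mean (only one of the summary statistics the abstract mentions), and estimates HUM as the fraction of cross-class tuples whose values are strictly increasing in the chosen order. Ties are simply counted as incorrect. The function names and the toy values are hypothetical and chosen for illustration only. The chance level of 1/M! for M classes is the standard value for an uninformative test and is offered here as one natural reference point, not necessarily the minimum acceptable value derived in the paper.

```python
from itertools import product
from math import factorial
from statistics import mean


def order_classes_by_mean(samples):
    """Return class labels sorted by ascending sample mean (hypothetical helper)."""
    return sorted(samples, key=lambda label: mean(samples[label]))


def empirical_hum(samples, order):
    """Estimate HUM for one test under a given class order.

    Counts the fraction of tuples (one value drawn from each class) whose
    values are strictly increasing when the classes are arranged in `order`.
    Ties are counted as incorrect in this sketch.
    """
    groups = [samples[label] for label in order]
    total = 0
    correct = 0
    for values in product(*groups):
        total += 1
        if all(a < b for a, b in zip(values, values[1:])):
            correct += 1
    return correct / total


def chance_hum(n_classes):
    """HUM of an uninformative test with n_classes categories: 1 / n_classes!."""
    return 1 / factorial(n_classes)


# Toy data (made up): three unordered categories measured by one test.
samples = {
    "A": [2.1, 2.5, 3.0],
    "B": [0.4, 0.9, 1.2],
    "C": [1.0, 1.8, 2.2],
}

order = order_classes_by_mean(samples)
hum = empirical_hum(samples, order)
hum_wrong_order = empirical_hum(samples, ["A", "B", "C"])

print("order by mean:", order)
print("HUM in that order:", hum)
print("HUM in alphabetical order:", hum_wrong_order)
print("chance level for 3 classes:", chance_hum(3))
```

Evaluating the same test under a poorly chosen order can push the estimate below the chance level, which is why the order has to be fixed before HUM values are compared across many tests. The exhaustive enumeration costs the product of the class sizes, so it is only practical for small samples.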
Related Subject Headings
- Transcriptome
- Toxicology
- Statistics, Nonparametric
- ROC Curve
- Models, Statistical
- Liver Neoplasms
- Leukemia
- Humans
- Gene Expression Profiling
- Data Interpretation, Statistical