
Pathway analysis using random forests classification and regression.

Publication, Journal Article
Pang, H; Lin, A; Holford, M; Enerson, BE; Lu, B; Lawton, MP; Floyd, E; Zhao, H
Published in: Bioinformatics
August 15, 2006

MOTIVATION: Although numerous methods have been developed to better capture biological information from microarray data, commonly used single gene-based methods neglect interactions among genes and leave room for other novel approaches. For example, most classification and regression methods for microarray data are based on the whole set of genes and have not made use of pathway information. Pathway-based analysis in microarray studies may lead to more informative and relevant knowledge for biological researchers.

RESULTS: In this paper, we describe a pathway-based classification and regression method using Random Forests to analyze gene expression data. The proposed methods allow researchers to rank important pathways from externally available databases, discover important genes, find pathway-based outlying cases and make full use of a continuous outcome variable in the regression setting. We also compared Random Forests with other machine learning methods using several datasets and found that Random Forests classification error rates were either the lowest or the second-lowest. By combining pathway information and novel statistical methods, this procedure represents a promising computational strategy in dissecting pathways and can provide biological insight into the study of microarray data.

AVAILABILITY: Source code written in R is available from http://bioinformatics.med.yale.edu/pathway-analysis/rf.htm.
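The pathway-ranking idea in the abstract can be illustrated with a minimal sketch. This is not the authors' R code. It assumes that each pathway is scored by fitting a separate Random Forest on only that pathway's genes and that pathways are ranked by out-of-bag (OOB) error; the abstract does not state the ranking criterion, so that choice is an assumption. The function name rank_pathways_by_oob_error, the pathway names, the gene names, and the data are all hypothetical, and the example uses Python with scikit-learn rather than R.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def rank_pathways_by_oob_error(expression, labels, gene_names, pathways,
                               n_trees=200, seed=0):
    """Fit one Random Forest per pathway and rank pathways by OOB error.

    expression: (n_samples, n_genes) array of expression values.
    pathways:   dict mapping a pathway name to a list of gene names.
    Ranking by OOB error is an assumption made for this sketch.
    """
    gene_index = {gene: i for i, gene in enumerate(gene_names)}
    results = []
    for name, genes in pathways.items():
        # Keep only the pathway genes that are present on the array.
        cols = [gene_index[g] for g in genes if g in gene_index]
        if not cols:
            continue
        forest = RandomForestClassifier(
            n_estimators=n_trees, oob_score=True, random_state=seed
        )
        forest.fit(expression[:, cols], labels)
        # oob_score_ is OOB accuracy, so the OOB error is its complement.
        results.append((name, 1.0 - forest.oob_score_))
    # Lowest OOB error first: the most predictive pathway ranks highest.
    return sorted(results, key=lambda item: item[1])


# Synthetic data, fabricated for illustration only (not from the paper).
rng = np.random.default_rng(0)
n_per_class, n_genes = 30, 30
labels = np.repeat([0, 1], n_per_class)
expression = rng.normal(size=(2 * n_per_class, n_genes))
expression[labels == 1, :5] += 3.0  # genes g0..g4 carry the class signal

gene_names = [f"g{i}" for i in range(n_genes)]
pathways = {
    "pathway_A": [f"g{i}" for i in range(0, 5)],    # informative genes
    "pathway_B": [f"g{i}" for i in range(10, 15)],  # noise only
    "pathway_C": [f"g{i}" for i in range(20, 25)],  # noise only
}

ranking = rank_pathways_by_oob_error(expression, labels, gene_names, pathways)
for name, err in ranking:
    print(f"{name}: OOB error = {err:.3f}")
```

For a continuous outcome, the same loop could use RandomForestRegressor with oob_score=True, in which case oob_score_ is an R^2 value and higher is better.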


Published In

Bioinformatics

DOI

10.1093/bioinformatics/btl344

EISSN

1367-4811

Publication Date

August 15, 2006

Volume

22

Issue

16

Start / End Page

2028 / 2036

Location

England

Related Subject Headings

  • Software
  • Regression Analysis
  • Rats
  • Pattern Recognition, Automated
  • Oligonucleotide Array Sequence Analysis
  • Models, Statistical
  • Models, Biological
  • Mice
  • Humans
  • Gene Expression Profiling
 

Citation

APA: Pang, H., Lin, A., Holford, M., Enerson, B. E., Lu, B., Lawton, M. P., … Zhao, H. (2006). Pathway analysis using random forests classification and regression. Bioinformatics, 22(16), 2028–2036. https://doi.org/10.1093/bioinformatics/btl344
Chicago: Pang, Herbert, Aiping Lin, Matthew Holford, Bradley E. Enerson, Bin Lu, Michael P. Lawton, Eugenia Floyd, and Hongyu Zhao. “Pathway analysis using random forests classification and regression.” Bioinformatics 22, no. 16 (August 15, 2006): 2028–36. https://doi.org/10.1093/bioinformatics/btl344.
ICMJE: Pang H, Lin A, Holford M, Enerson BE, Lu B, Lawton MP, et al. Pathway analysis using random forests classification and regression. Bioinformatics. 2006 Aug 15;22(16):2028–36.
MLA: Pang, Herbert, et al. “Pathway analysis using random forests classification and regression.” Bioinformatics, vol. 22, no. 16, Aug. 2006, pp. 2028–36. PubMed, doi:10.1093/bioinformatics/btl344.
NLM: Pang H, Lin A, Holford M, Enerson BE, Lu B, Lawton MP, Floyd E, Zhao H. Pathway analysis using random forests classification and regression. Bioinformatics. 2006 Aug 15;22(16):2028–2036.
