Expression-based Pathway Signature Analysis (EPSA): mining publicly available microarray data for insight into human disease.
BACKGROUND: Publicly available data repositories facilitate the sharing of an ever-increasing amount of microarray data. However, these datasets remain highly underutilized. Reutilizing the data could offer insights into questions and diseases entirely distinct from those considered in the original experimental design.
METHODS: We first analyzed microarray datasets derived from known perturbations of specific pathways using the samr package in R to identify specific patterns of change in gene expression. We refer to these patterns of gene expression alteration as "pathway signatures." We then used Spearman's rank correlation coefficient, a non-parametric measure of correlation, to determine similarities between pathway signatures and disease profiles, and permutation analysis to evaluate false discovery rate. This enabled detection of statistically significant similarity between these pathway signatures and corresponding changes observed in human disease. Finally, we evaluated pathway activation, as indicated by correlation with the pathway signature, as a risk factor for poor prognosis using multiple unrelated, publicly available datasets.
RESULTS: We have developed a novel method, Expression-based Pathway Signature Analysis (EPSA). We demonstrate that EPSA is a rigorous computational approach for statistically evaluating the degree of similarity between highly disparate sources of microarray expression data. We also show how EPSA can be used in a number of cases to stratify patients with differential disease prognosis. EPSA can be applied to many different types of datasets in spite of different platforms, different experimental designs, and different species. Applying this method can yield new insights into human disease progression.
CONCLUSION: EPSA enables the use of publicly available data for an entirely new, translational purpose: identifying potential pathways of dysregulation in human disease, as well as potential leads for therapeutic molecular targets.
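ILLUSTRATION: The correlation step described in METHODS can be sketched roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes a pathway signature and a disease profile are each given as a list of per-gene expression changes in the same gene order, and it estimates an empirical permutation p-value by shuffling gene labels rather than reproducing the false discovery rate procedure used in the paper. All function names and the example values are hypothetical.

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(signature, profile, n_perm=1000, seed=0):
    """Empirical two-sided p-value from shuffling the profile's gene order.

    Assumption: gene-label permutation is used as the null model here;
    the paper's own permutation scheme may differ.
    """
    rng = random.Random(seed)
    observed = spearman(signature, profile)
    shuffled = list(profile)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(signature, shuffled)) >= abs(observed):
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical per-gene changes for six genes, in matching order.
signature = [2.1, -1.3, 0.4, 3.0, -0.7, 1.2]
profile = [1.8, -0.9, 0.1, 2.5, -1.1, 0.6]
rho, p = permutation_p_value(signature, profile)
```

A high positive rho with a small p would suggest that the disease profile resembles the pathway signature; in practice many signatures and profiles are compared, which is why a correction such as false discovery rate control is needed.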
Tenenbaum, JD; Walker, MG; Utz, PJ; Butte, AJ