Magellan: a web based system for the integrated analysis of heterogeneous biological data and annotations; application to DNA copy number and expression data in ovarian cancer.

Published online

Journal Article

Recent advances in high-throughput biological methods allow researchers to generate enormous amounts of data from a single experiment. In order to extract meaningful conclusions from this tidal wave of data, it will be necessary to develop analytical methods of sufficient power and utility. It is particularly important that biologists themselves be able to perform many of these analyses, such that their background knowledge of the experimental system under study can be used to interpret results and direct further inquiries. We have developed a web-based system, Magellan, which allows the upload, storage, and analysis of multivariate data and textual or numerical annotations. Data and annotations are treated as abstract entities, to maximize the different types of information the system can store and analyze. Annotations can be used in analyses/visualizations, as a means of subsetting data to reduce dimensionality, or as a means of projecting variables from one data type or data set to another. Analytical methods are deployed within Magellan such that new functionalities can be added in a straightforward fashion. Using Magellan, we performed an integrated analysis of genome-wide comparative genomic hybridization (CGH), mRNA expression, and clinical data from ovarian tumors. Analyses included the use of permutation-based methods to identify genes whose mRNA expression levels correlated with patient survival, a nearest-neighbor classifier to predict patient survival from CGH data, and curated annotations such as genomic position and derived annotations such as statistical computations to explore the quantitative relationship between CGH and mRNA expression data.
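
The abstract names permutation-based methods for finding genes whose expression correlates with patient survival but gives no details. The following is a minimal sketch of one common form of such a test, not Magellan's implementation: it assumes Pearson correlation as the statistic, uncensored survival times, and a two-sided test. The function names `pearson` and `permutation_pvalue` and all parameters are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def permutation_pvalue(expression, survival, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the expression/survival correlation.

    Assumption: survival times are fully observed (no censoring), which real
    clinical data often violate; a censoring-aware statistic would be needed there.
    """
    rng = random.Random(seed)
    observed = abs(pearson(expression, survival))
    shuffled = list(survival)
    hits = 0
    for _ in range(n_perm):
        # Shuffling survival labels breaks any real gene/survival association.
        rng.shuffle(shuffled)
        if abs(pearson(expression, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Applied gene by gene, a test like this yields one p-value per gene; screening many genes at once would additionally call for a multiple-testing correction, which this sketch does not attempt.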

Cited Authors

  • Kingsley, CB; Kuo, W-L; Polikoff, D; Berchuck, A; Gray, JW; Jain, AN

Published Date

  • February 5, 2007

Volume / Issue

  • 2 /

Start / End Page

  • 10 - 21

PubMed ID

  • 19458754

Pubmed Central ID

  • 19458754

Electronic International Standard Serial Number (EISSN)

  • 1176-9351

Language

  • eng

Conference Location

  • United States