Canonical correlation analysis for multi-omics: Application to cross-cohort analysis.

Publication, Journal Article
Jiang, M-Z; Aguet, F; Ardlie, K; Chen, J; Cornell, E; Cruz, D; Durda, P; Gabriel, SB; Gerszten, RE; Guo, X; Johnson, CW; Kasela, S; Lange, LA ...
Published in: PLoS Genet
May 2023

Integrative approaches that simultaneously model multi-omics data have gained increasing popularity because they provide holistic systems-biology views of multiple or all components in a biological system of interest. Canonical correlation analysis (CCA) is a correlation-based integrative method designed to extract latent features shared between multiple assays by finding the linear combinations of features, referred to as canonical variables (CVs), within each assay that achieve maximal across-assay correlation. Although widely acknowledged as a powerful approach for multi-omics data, CCA has not been systematically applied to multi-omics data in large cohort studies, which have only recently become available. Here, we adapted sparse multiple CCA (SMCCA), a widely used derivative of CCA, to proteomics and methylomics data from the Multi-Ethnic Study of Atherosclerosis (MESA) and Jackson Heart Study (JHS). To tackle challenges encountered when applying SMCCA to MESA and JHS, our adaptations include the incorporation of the Gram-Schmidt (GS) algorithm with SMCCA to improve orthogonality among CVs, and the development of Sparse Supervised Multiple CCA (SSMCCA) to allow supervised integration analysis for more than two assays. Effective application of SMCCA to the two real datasets reveals important findings. Applying our SMCCA-GS to MESA and JHS, we identified strong associations between blood cell counts and protein abundance, suggesting that adjustment for blood cell composition should be considered in protein-based association studies. Importantly, CVs obtained from two independent cohorts also demonstrate transferability across the cohorts. For example, proteomic CVs learned from JHS, when transferred to MESA, explain similar amounts of blood cell count phenotypic variance in both cohorts: 39.0% to 50.0% in JHS and 38.9% to 49.1% in MESA. Similar transferability was observed for other omics-CV-trait pairs. This suggests that biologically meaningful and cohort-agnostic variation is captured by CVs. We anticipate that applying our SMCCA-GS and SSMCCA to various cohorts would help identify cohort-agnostic, biologically meaningful relationships between multi-omics data and phenotypic traits.
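
To make the two building blocks named in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' SMCCA-GS or SSMCCA: it shows classical, non-sparse CCA for two assays only (the first pair of canonical variables, obtained by whitening and an SVD) and a generic Gram-Schmidt pass over a matrix of CVs. The function names `first_canonical_pair` and `gram_schmidt`, the small ridge term `reg`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def first_canonical_pair(X, Y, reg=1e-8):
    """Classical two-view CCA (illustrative, non-sparse).

    Returns weight vectors a, b and the first canonical correlation, so that
    X @ a and Y @ b are the first pair of canonical variables.
    `reg` is a hypothetical tiny ridge term added only for numerical stability.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map the leading singular vectors back to the original feature space
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0])
    return a, b, s[0]

def gram_schmidt(cvs):
    """Orthonormalize the columns of `cvs` (samples x k) in order,
    using the modified Gram-Schmidt procedure."""
    Q = np.zeros_like(cvs, dtype=float)
    for j in range(cvs.shape[1]):
        v = cvs[:, j].astype(float)
        for i in range(j):
            # Remove the component along each earlier, already-orthonormal column
            v = v - (Q[:, i] @ v) * Q[:, i]
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# Synthetic two-assay example: both assays share one latent factor z.
rng = np.random.default_rng(0)
z = rng.normal(size=500)
X = np.outer(z, [1.0, 0.5, 0.2]) + 0.3 * rng.normal(size=(500, 3))
Y = np.outer(z, [0.8, 0.0, 0.4, 0.6]) + 0.3 * rng.normal(size=(500, 4))
a, b, rho = first_canonical_pair(X, Y)
cv_x, cv_y = X @ a, Y @ b  # first pair of canonical variables
```

In this sketch, the sample correlation between `cv_x` and `cv_y` matches the returned `rho` up to the small ridge term. Sparse variants add penalties on the weights, which is typically why successive CVs can lose exact orthogonality and why an orthogonalization step such as Gram-Schmidt becomes useful; how the paper combines the two is described in the article itself.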

Published In

PLoS Genet

DOI

10.1371/journal.pgen.1010517

EISSN

1553-7404

Publication Date

May 2023

Volume

19

Issue

5

Start / End Page

e1010517

Location

United States

Related Subject Headings

  • Proteomics
  • Multiomics
  • Humans
  • Developmental Biology
  • Cohort Studies
  • Canonical Correlation Analysis
  • 3105 Genetics
  • 0604 Genetics

Citation

APA: Jiang, M.-Z., Aguet, F., Ardlie, K., Chen, J., Cornell, E., Cruz, D., … NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed Analysis Working Group. (2023). Canonical correlation analysis for multi-omics: Application to cross-cohort analysis. PLoS Genet, 19(5), e1010517. https://doi.org/10.1371/journal.pgen.1010517

Chicago: Jiang, Min-Zhi, François Aguet, Kristin Ardlie, Jiawen Chen, Elaine Cornell, Dan Cruz, Peter Durda, et al. “Canonical correlation analysis for multi-omics: Application to cross-cohort analysis.” PLoS Genet 19, no. 5 (May 2023): e1010517. https://doi.org/10.1371/journal.pgen.1010517.

ICMJE: Jiang M-Z, Aguet F, Ardlie K, Chen J, Cornell E, Cruz D, et al. Canonical correlation analysis for multi-omics: Application to cross-cohort analysis. PLoS Genet. 2023 May;19(5):e1010517.

MLA: Jiang, Min-Zhi, et al. “Canonical correlation analysis for multi-omics: Application to cross-cohort analysis.” PLoS Genet, vol. 19, no. 5, May 2023, p. e1010517. PubMed, doi:10.1371/journal.pgen.1010517.

NLM: Jiang M-Z, Aguet F, Ardlie K, Chen J, Cornell E, Cruz D, Durda P, Gabriel SB, Gerszten RE, Guo X, Johnson CW, Kasela S, Lange LA, Lappalainen T, Liu Y, Reiner AP, Smith J, Sofer T, Taylor KD, Tracy RP, VanDenBerg DJ, Wilson JG, Rich SS, Rotter JI, Love MI, Raffield LM, Li Y, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed Analysis Working Group. Canonical correlation analysis for multi-omics: Application to cross-cohort analysis. PLoS Genet. 2023 May;19(5):e1010517.
