Penalized multimarker vs. single-marker regression methods for genome-wide association studies of quantitative traits.

Published

Journal Article

The data from genome-wide association studies (GWAS) in humans are still predominantly analyzed using single-marker association methods. As an alternative to single-marker analysis (SMA), all or subsets of markers can be tested simultaneously. This approach requires a form of penalized regression (PR) because the number of single-nucleotide polymorphisms (SNPs) is much larger than the sample size. Here we review PR methods in the context of GWAS, extend them to perform penalty parameter and SNP selection by false discovery rate (FDR) control, and assess their performance in comparison with SMA. PR methods were compared with SMA using realistically simulated GWAS data with a continuous phenotype and real data. Based on these comparisons, our analytic FDR criterion may currently be the best approach to SNP selection using PR for GWAS. We found that PR with FDR control provides substantially more power than SMA with genome-wide type-I error control but somewhat less power than SMA with Benjamini-Hochberg FDR control (SMA-BH). PR with FDR-based penalty parameter selection controlled the FDR somewhat conservatively, while SMA-BH may not achieve FDR control in all situations. Differences among PR methods seem quite small when the focus is on SNP selection with FDR control. Incorporating linkage disequilibrium into the penalization by adapting penalties developed for covariates measured on graphs can improve power but can also generate more false positives or wider regions for follow-up. We recommend the elastic net with a mixing weight for the Lasso penalty near 0.5 as the best method.
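For readers unfamiliar with the SMA-BH comparator named in the abstract, the following is a minimal sketch of the standard Benjamini-Hochberg step-up procedure applied to per-SNP p-values. It is not the authors' analytic FDR criterion for PR, which the abstract does not specify. The function name `benjamini_hochberg`, the target level `q`, and the toy p-values are assumptions made for illustration only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices of tests rejected by the BH step-up rule.

    p_values: one p-value per marker (hypothetical single-marker tests).
    q: target false discovery rate (an assumed default, not from the paper).
    """
    m = len(p_values)
    if m == 0:
        return []
    # Rank markers from smallest to largest p-value, keeping original indices.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) whose p-value satisfies p_(k) <= k*q/m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every marker ranked at or below k_max, even if some of them
    # individually exceed their own threshold (the "step-up" property).
    return sorted(order[:k_max])


if __name__ == "__main__":
    # Toy p-values for four hypothetical SNPs, deliberately unsorted.
    toy_p = [0.9, 0.001, 0.035, 0.03]
    print(benjamini_hochberg(toy_p, q=0.05))
```

In the toy input, the SNP with p = 0.03 exceeds its own rank threshold (0.025) but is still selected because a higher-ranked SNP (p = 0.035) meets its threshold (0.0375); this is the behavior that distinguishes the step-up rule from a per-test cutoff.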

Cited Authors

  • Yi, H; Breheny, P; Imam, N; Liu, Y; Hoeschele, I

Published Date

  • January 2015

Published In

  • Genetics

Volume / Issue

  • 199 / 1

Start / End Page

  • 205 - 222

PubMed ID

  • 25354699

PubMed Central ID

  • 25354699

Electronic International Standard Serial Number (EISSN)

  • 1943-2631

Digital Object Identifier (DOI)

  • 10.1534/genetics.114.167817

Language

  • eng

Conference Location

  • United States