The impact of disregarding family structure on genome-wide association analysis of complex diseases in cohorts with simple pedigrees.

Journal Article

Generalized linear mixed models (GLMMs) are the standard framework for genome-wide association studies (GWAS) of complex diseases in family-based cohorts. Fitting GLMMs in very large cohorts, however, can be computationally demanding. Moreover, modified versions of GLMMs that use faster algorithms may underperform, for instance when a single nucleotide polymorphism (SNP) is correlated with fixed-effects covariates. We investigated the extent to which disregarding family structure may compromise GWAS in cohorts with simple pedigrees by contrasting logistic regression models (i.e., with no family structure) with three LMM-based models. Our analyses showed that the logistic regression models generally yielded smaller P values than the LMM-based models; however, the differences in P values were mostly minor. Disregarding family structure had little impact on identifying disease-associated SNPs at the genome-wide level of significance (i.e., P < 5E-08), because the four P values produced by the tested methods for any given SNP were either all below or all above 5E-08. Nevertheless, larger discrepancies were detected between the logistic regression and LMM-based models at the suggestive level of significance (i.e., 5E-08 ≤ P < 5E-06). The SNP effects estimated by the logistic regression models were not statistically different from those estimated by GLMMs that implemented Wald's test. However, several SNP effects were significantly different from their counterparts in the LMM analyses. We suggest that fitting GLMMs with Wald's test on a pre-selected subset of SNPs obtained from logistic regression models can ensure a balance between the speed of analysis and the accuracy of parameter estimates.
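The significance tiers and the pre-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the choice of the suggestive threshold as the pre-selection cutoff, and the example P values are all hypothetical and serve only to show the logic.

```python
# Thresholds quoted in the abstract.
GENOME_WIDE = 5e-8
SUGGESTIVE = 5e-6


def tier(p):
    """Classify a P value as genome-wide, suggestive, or not significant."""
    if p < GENOME_WIDE:
        return "genome-wide"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"


def methods_agree(p_values):
    """True when every method places the SNP in the same significance tier."""
    return len({tier(p) for p in p_values}) == 1


def preselect(logistic_p, threshold=SUGGESTIVE):
    """Return SNP IDs whose logistic-regression P value is below the cutoff.

    Assumption: the cutoff defaults to the suggestive threshold; the
    abstract does not state which cutoff should be used for pre-selection.
    """
    return [snp for snp, p in logistic_p.items() if p < threshold]


# Hypothetical P values per SNP: logistic regression first, then three
# mixed-model methods. These numbers are made up for illustration.
example = {
    "snp_a": [1e-9, 2e-9, 3e-9, 4e-9],
    "snp_b": [4e-8, 6e-8, 7e-8, 9e-8],
    "snp_c": [0.2, 0.3, 0.25, 0.4],
}

agreement = {snp: methods_agree(ps) for snp, ps in example.items()}
candidates = preselect({snp: ps[0] for snp, ps in example.items()})
```

In this made-up example, `snp_b` is the kind of borderline case the abstract describes: logistic regression falls just below 5E-08 while the mixed-model P values fall in the suggestive range. The SNPs in `candidates` would then be passed to a GLMM fitted with Wald's test.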

Duke Authors

Cited Authors

  • Nazarian, A; Arbeev, KG; Kulminski, AM

Published Date

  • February 2020

Published In

Volume / Issue

  • 61 / 1

Start / End Page

  • 75 - 86

PubMed ID

  • 31755004

Pubmed Central ID

  • PMC6980752

Electronic International Standard Serial Number (EISSN)

  • 2190-3883

International Standard Serial Number (ISSN)

  • 1234-1983

Digital Object Identifier (DOI)

  • 10.1007/s13353-019-00526-7

Language

  • eng