A permutation procedure to correct for confounders in case-control studies, including tests of rare variation

Journal Article

Many case-control tests of rare variation are implemented in statistical frameworks that make correction for confounders like population stratification difficult. Simple permutation of disease status is unacceptable for resolving this issue because the replicate data sets do not have the same confounding as the original data set. These limitations make it difficult to apply rare-variant tests to samples in which confounding most likely exists, e.g., samples collected from admixed populations. To enable the use of such rare-variant methods in structured samples, as well as to facilitate permutation tests for any situation in which case-control tests require adjustment for confounding covariates, we propose to establish the significance of a rare-variant test via a modified permutation procedure. Our procedure uses Fisher's noncentral hypergeometric distribution to generate permuted data sets with the same structure present in the actual data set such that inference is valid in the presence of confounding factors. We use simulated sequence data based on coalescent models to show that our permutation strategy corrects for confounding due to population stratification that, if ignored, would otherwise inflate the size of a rare-variant test. We further illustrate the approach by using sequence data from the Dallas Heart Study of energy metabolism traits. Researchers can implement our permutation approach by using the R package BiasedUrn. © 2012 The American Society of Human Genetics.
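The resampling idea in the abstract can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation (the abstract points to the R package BiasedUrn for that). It assumes each subject has a disease odds value estimated from the confounders alone, and it draws case labels so that the number of cases stays fixed, which is a multivariate Fisher's noncentral hypergeometric draw with one subject per "colour". The function names, the toy burden statistic, and the example odds are all hypothetical.

```python
import random


def sample_case_labels(odds, n_cases, rng):
    """Draw a 0/1 disease vector with exactly n_cases ones.

    Subject i is weighted by odds[i] (assumed positive). The draw follows the
    conditional distribution of independent Bernoulli outcomes given their
    total, i.e. Fisher's noncentral hypergeometric with one ball per colour.
    """
    n = len(odds)
    if not 0 <= n_cases <= n:
        raise ValueError("n_cases must lie between 0 and the number of subjects")
    # e[i][k]: sum, over all size-k subsets of subjects i..n-1, of the product
    # of their odds (elementary symmetric polynomials, built back to front).
    e = [[0.0] * (n_cases + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        e[i][0] = 1.0
    for i in range(n - 1, -1, -1):
        for k in range(1, n_cases + 1):
            e[i][k] = e[i + 1][k] + odds[i] * e[i + 1][k - 1]
    labels, remaining = [], n_cases
    for i in range(n):
        if remaining == 0:
            labels.append(0)
            continue
        # Probability that subject i is a case, given how many cases are left.
        p_case = odds[i] * e[i + 1][remaining - 1] / e[i][remaining]
        is_case = 1 if rng.random() < p_case else 0
        labels.append(is_case)
        remaining -= is_case
    return labels


def permutation_p_value(statistic, genotypes, labels, odds, n_perm=1000, seed=0):
    """One-sided p-value of `statistic` under confounder-aware resampling."""
    rng = random.Random(seed)
    observed = statistic(genotypes, labels)
    n_cases = sum(labels)
    hits = sum(
        statistic(genotypes, sample_case_labels(odds, n_cases, rng)) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)


def burden_in_cases(genotypes, labels):
    """Toy rare-variant statistic: rare-allele count carried by cases."""
    return sum(g for g, y in zip(genotypes, labels) if y == 1)


# Hypothetical data: two strata with different baseline disease odds.
genotypes = [1, 0, 2, 1, 0, 0, 0, 1, 0, 0]
labels = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
odds = [3.0] * 5 + [0.5] * 5  # assumed output of a confounder-only model
p = permutation_p_value(burden_in_cases, genotypes, labels, odds, n_perm=2000, seed=42)
print(p)
```

Setting every entry of `odds` to the same value reduces this to the simple permutation of disease status that the abstract rejects. Unequal odds are what keep the confounding structure in each replicate data set. The dynamic-programming table is kept in plain floating point, so this sketch is suited to small samples only; larger studies would need a numerically stable sampler.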

Cited Authors

  • Epstein, MP; Duncan, R; Jiang, Y; Conneely, KN; Allen, AS; Satten, GA

Published Date

  • 2012

Published In

  • American Journal of Human Genetics

Volume / Issue

  • 91 / 2

Start / End Page

  • 215 - 223

PubMed ID

  • 22818855

PubMed Central ID

  • 22818855

International Standard Serial Number (ISSN)

  • 0002-9297

Digital Object Identifier (DOI)

  • 10.1016/j.ajhg.2012.06.004