A simple and improved correction for population stratification in case-control studies.

Journal Article

Population stratification remains an important issue in case-control studies of disease-marker association, even within populations considered to be genetically homogeneous. Campbell et al. (Nature Genetics 2005;37:868-872) illustrated this by showing that stratification induced a spurious association between the lactase gene (LCT) and tall/short status in a European American sample. Furthermore, existing approaches for controlling stratification by use of substructure-informative loci (e.g., genomic control, structured association, and principal components) could not resolve this confounding. To address this problem, we propose a simple two-step procedure. In the first step, we model the odds of disease, given data on substructure-informative loci (excluding the test locus). For each participant, we use this model to calculate a stratification score, which is that participant's estimated odds of disease calculated using his or her substructure-informative-loci data in the disease-odds model. In the second step, we assign subjects to strata defined by stratification score and then test for association between the disease and the test locus within these strata. The resulting association test is valid even in the presence of population stratification. Our approach is computationally simple and less model dependent than are existing approaches for controlling stratification. To illustrate these properties, we apply our approach to the data from Campbell et al. and find no association between the LCT locus and tall/short status. Using simulated data, we show that our approach yields a more appropriate correction for stratification than does principal components or genomic control.
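The two-step procedure described in the abstract can be sketched in code. The following is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not say how the disease-odds model is fit, how many strata are used, or which within-strata test is applied. Here the disease-odds model is assumed to be a plain logistic regression fit by gradient ascent, strata are assumed to be five equal-sized quantile groups of the score, and the within-strata test is assumed to be a Mantel-Haenszel test on carrier status at the test locus. All function names and the simulated data are hypothetical.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, steps=200, lr=0.1):
    """Step 1 (assumed model): logistic regression of disease status on the
    substructure-informative loci, fit by batch gradient ascent.
    Returns [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            lp = beta[0] + sum(b * v for b, v in zip(beta[1:], xi))
            resid = yi - sigmoid(lp)
            grad[0] += resid
            for j, v in enumerate(xi):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def stratification_score(beta, x):
    """Estimated odds of disease given a participant's informative-loci data."""
    return math.exp(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))


def assign_strata(scores, k=5):
    """Step 2a (assumed rule): k roughly equal-sized strata by score rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * k // len(scores)
    return strata


def cmh_test(exposed, case, strata):
    """Step 2b (assumed test): Mantel-Haenszel chi-square (1 df, no continuity
    correction) for a binary exposure and binary outcome across strata.
    Returns (statistic, p_value)."""
    diff, var = 0.0, 0.0
    for s in set(strata):
        idx = [i for i, t in enumerate(strata) if t == s]
        N = len(idx)
        if N < 2:
            continue
        n1 = sum(exposed[i] for i in idx)          # exposed subjects
        m1 = sum(case[i] for i in idx)             # cases
        a = sum(exposed[i] * case[i] for i in idx) # exposed cases
        diff += a - n1 * m1 / N
        var += n1 * (N - n1) * m1 * (N - m1) / (N * N * (N - 1))
    stat = diff * diff / var if var > 0 else 0.0
    return stat, math.erfc(math.sqrt(stat / 2.0))


def simulate(n=300, n_loci=8, seed=1):
    """Hypothetical data: two subpopulations that differ in allele frequency
    and disease risk; the test locus has no causal effect on disease."""
    rng = random.Random(seed)
    X, test_locus, disease = [], [], []
    for _ in range(n):
        pop = rng.random() < 0.5
        freq = 0.7 if pop else 0.3
        X.append([(rng.random() < freq) + (rng.random() < freq)
                  for _ in range(n_loci)])
        test_locus.append((rng.random() < freq) + (rng.random() < freq))
        disease.append(int(rng.random() < (0.7 if pop else 0.3)))
    return X, test_locus, disease


if __name__ == "__main__":
    X, genotype, disease = simulate()
    carrier = [int(g > 0) for g in genotype]
    beta = fit_logistic(X, disease)  # the test locus is excluded from X
    scores = [stratification_score(beta, x) for x in X]
    strata = assign_strata(scores, k=5)
    print("unadjusted:      ", cmh_test(carrier, disease, [0] * len(disease)))
    print("score-stratified:", cmh_test(carrier, disease, strata))
```

In this sketch the unadjusted comparison pools all subjects into one stratum, so any difference between the two printed results reflects only the stratification on the score. Note that the test locus is kept out of the score model, as the abstract specifies.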

Cited Authors

  • Epstein, MP; Allen, AS; Satten, GA

Published Date

  • May 2007

Published In

Volume / Issue

  • 80 / 5

Start / End Page

  • 921 - 930

PubMed ID

  • 17436246

Pubmed Central ID

  • PMC1852732

Electronic International Standard Serial Number (EISSN)

  • 1537-6605

International Standard Serial Number (ISSN)

  • 0002-9297

Digital Object Identifier (DOI)

  • 10.1086/516842


Language

  • eng