Stratification-score matching improves correction for confounding by population stratification in case-control association studies.

Journal Article

Proper control of confounding due to population stratification is crucial for valid analysis of case-control association studies. Fine matching of cases and controls based on genetic ancestry is an increasingly popular strategy to correct for such confounding, both in genome-wide association studies (GWASs) and in studies that employ next-generation sequencing, where matching can be used when selecting a subset of participants from a GWAS for rare-variant analysis. Existing matching methods match on measures of genetic ancestry that combine multiple components of ancestry into a scalar quantity. However, we show that including nonconfounding ancestry components in a matching criterion can lead to inaccurate matches, and hence to improper control of confounding. To resolve this issue, we propose a novel method that assigns cases and controls to matched strata based on the stratification score (Epstein et al. [2007] Am J Hum Genet 80:921-930), which is the probability of disease given genomic variables. Matching on the stratification score leads to more accurate matches because case participants are matched to control participants who have a similar risk of disease given ancestry information. We illustrate our matching method using the African-American arm of the GAIN GWAS of schizophrenia. In this study, we observe that confounding due to stratification can be resolved by our matching approach but not by existing matching procedures. We also use simulated data to show that our novel matching approach can provide a more appropriate correction for population stratification than existing matching approaches.
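
The idea of matching on the stratification score can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract defines the score only as the probability of disease given genomic variables and does not say how strata are formed, so everything specific here is an assumption: the two simulated ancestry components (`pc1` confounding, `pc2` nonconfounding), the plain logistic regression used to estimate the score, the quantile binning used as a stand-in for the matching step, and the hypothetical helper names `fit_logistic`, `predict_one`, and `match_into_strata`.

```python
import math
import random


def predict_one(beta, x):
    """Estimated probability of disease for one participant (logistic model)."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, n_iter=500):
    """Fit logistic regression by batch gradient ascent.

    Returns [intercept, coef_1, ..., coef_p]. This is an assumed stand-in
    for whatever model is used to estimate the stratification score.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            resid = yi - predict_one(beta, xi)
            grad[0] += resid
            for j in range(p):
                grad[j + 1] += resid * xi[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def match_into_strata(scores, y, n_strata=5):
    """Bin participants into quantile strata of the score.

    Quantile binning is an assumption made for illustration. Strata that
    lack either cases or controls are dropped, since they carry no
    within-stratum case-control contrast.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = {}
    for rank, i in enumerate(order):
        strata.setdefault(rank * n_strata // len(order), []).append(i)
    return {
        s: idx
        for s, idx in strata.items()
        if any(y[i] == 1 for i in idx) and any(y[i] == 0 for i in idx)
    }


# Simulated data (hypothetical): disease risk depends on pc1 only, so pc2 is
# an ancestry component that does not confound the association.
random.seed(0)
X, y = [], []
for _ in range(400):
    pc1, pc2 = random.gauss(0, 1), random.gauss(0, 1)
    risk = 1.0 / (1.0 + math.exp(-(-0.5 + 1.5 * pc1)))
    X.append([pc1, pc2])
    y.append(1 if random.random() < risk else 0)

beta = fit_logistic(X, y)
scores = [predict_one(beta, x) for x in X]
strata = match_into_strata(scores, y)

for s in sorted(strata):
    members = strata[s]
    n_cases = sum(y[i] for i in members)
    print(f"stratum {s}: {n_cases} cases, {len(members) - n_cases} controls")
```

Because the score is a model of disease risk, a component such as `pc2` that is unrelated to disease receives a coefficient near zero and has little influence on which participants share a stratum. A distance computed from both components with equal weight would let `pc2` affect the matches just as strongly as `pc1`, which is the problem the abstract attributes to scalar ancestry measures.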

Cited Authors

  • Epstein, MP; Duncan, R; Broadaway, KA; He, M; Allen, AS; Satten, GA

Published Date

  • April 2012

Published In

  • Genetic Epidemiology

Volume / Issue

  • 36 / 3

Start / End Page

  • 195 - 205

PubMed ID

  • 22714934

PubMed Central ID

  • PMC3671578

Electronic International Standard Serial Number (EISSN)

  • 1098-2272

Digital Object Identifier (DOI)

  • 10.1002/gepi.21611

Language

  • eng

Conference Location

  • United States