Secondary analysis of case-control association studies: Insights on weighting-based inference motivate a new specification test.


Journal Article

Case-control sampling is frequently used in genetic association studies to examine the relationship between disease and genetic exposures. Such designs usually collect extensive information on phenotypes beyond the primary disease, and the associations of these secondary phenotypes with the genetic exposures are also of great interest. Because the cases are over-sampled, appropriate analysis of secondary phenotypes should take this biased sampling design into account. We previously introduced a weighting-based estimator for appropriate secondary analysis, but did not thoroughly explore its statistical properties. In this article, we revisit our previous estimator to offer new insights and methodological extensions. Specifically, we extend it to a more general form based on generalized least squares (GLS). This extension allows us to connect the GLS estimator with the generalized method of moments and motivates a new specification test designed to assess the adequacy of the disease model or the weights. The specification test statistic measures the weighted discrepancy between the case and control subsample estimators, and asymptotically follows a central chi-squared distribution under correct disease model specification. We illustrate the GLS estimator and specification test using a case-control sample of peripheral arterial disease, and use simulations to examine the operating characteristics of the specification test.
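The general idea of combining two subsample estimators and testing their agreement can be sketched in a few lines. This is a generic illustration, not the authors' estimator: it assumes a single scalar parameter, that the case and control subsample estimates are independent with known variances, and it omits the disease-model weights entirely. All function names and numbers below are hypothetical.

```python
import math


def gls_combine(b_case, v_case, b_ctrl, v_ctrl):
    """Inverse-variance (GLS-style) combination of two scalar estimates.

    Assumes the two estimates are independent; returns the combined
    estimate and its variance.
    """
    w_case = 1.0 / v_case
    w_ctrl = 1.0 / v_ctrl
    v_gls = 1.0 / (w_case + w_ctrl)
    b_gls = v_gls * (w_case * b_case + w_ctrl * b_ctrl)
    return b_gls, v_gls


def discrepancy_test(b_case, v_case, b_ctrl, v_ctrl):
    """Squared difference of the two estimates, scaled by its variance.

    Under independence and agreement of the two estimators, the statistic
    is approximately chi-squared with 1 degree of freedom.
    """
    stat = (b_case - b_ctrl) ** 2 / (v_case + v_ctrl)
    # Survival function of chi-squared(1): P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical subsample estimates and variances, for illustration only.
b_gls, v_gls = gls_combine(0.5, 0.04, 0.3, 0.04)
stat, p_value = discrepancy_test(0.5, 0.04, 0.3, 0.04)
```

A large statistic (small p-value) would suggest that the two subsamples disagree more than sampling error explains, which in the setting of the abstract points toward a misspecified disease model or weights.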

Cited Authors

  • Li, F; Allen, AS

Published Date

  • September 30, 2020

Published In

  • Statistics in Medicine

Volume / Issue

  • 39 / 22

Start / End Page

  • 2869 - 2882

PubMed ID

  • 32501597

Electronic International Standard Serial Number (EISSN)

  • 1097-0258

Digital Object Identifier (DOI)

  • 10.1002/sim.8579


Language

  • eng

Conference Location

  • England