Effect of population stratification on the identification of significant single-nucleotide polymorphisms in genome-wide association studies.
The North American Rheumatoid Arthritis Consortium case-control study collected case participants from across the United States and control participants from New York. More than 500,000 single-nucleotide polymorphisms (SNPs) were genotyped in the sample of 2000 cases and controls. These data must be analyzed with careful adjustment for the confounding effect of population stratification; without adjustment, the variance inflation factor (VIF) is 1.44. In the primary analyses of these data, a clustering algorithm in the program PLINK was used to reduce the VIF to 1.14, after which genomic control was used to control residual confounding. Here we use stratification scores to achieve a unified and coherent control for confounding. To calculate the stratification score, we used the first 10 principal components as risk factors in a logistic model; the components were calculated genome-wide from a set of 81,500 loci selected to have low pairwise linkage disequilibrium. We then divided the data into five strata based on quantiles of the stratification score. The VIF of these stratified data is 1.04, indicating substantial control of stratification. After controlling for stratification, however, we find no loci outside of the HLA region that are significantly associated with rheumatoid arthritis. In particular, we find no evidence for association of TRAF1-C5 with rheumatoid arthritis.
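The stratification-score steps described above (logistic model on principal components, then quantile strata) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the Newton-Raphson fitting routine, and the simulated inputs are all assumptions made for illustration. The inflation factor shown is one common estimator (median test statistic divided by the median of a 1-df chi-square distribution), which may differ from the estimator used in the study.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, ridge=1e-8):
    """Fit a logistic model by Newton-Raphson; returns coefficients, intercept first."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        grad = Xd.T @ (y - p)
        # Tiny ridge term keeps the Hessian invertible (an illustrative safeguard).
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def stratification_score(pcs, is_case):
    """Predicted probability of being a case given the principal components."""
    y = np.asarray(is_case, dtype=float)
    beta = fit_logistic(pcs, y)
    Xd = np.column_stack([np.ones(len(y)), pcs])
    return 1.0 / (1.0 + np.exp(-Xd @ beta))

def assign_strata(score, n_strata=5):
    """Label each subject 0..n_strata-1 by quantile of the score."""
    cuts = np.quantile(score, np.linspace(0.0, 1.0, n_strata + 1)[1:-1])
    return np.searchsorted(cuts, score, side="right")

def inflation_factor(chi2_stats):
    """Median of 1-df chi-square statistics over its null median (about 0.4549)."""
    return float(np.median(chi2_stats) / 0.4549364)

if __name__ == "__main__":
    # Simulated stand-in data (hypothetical): 2000 subjects, 10 principal components.
    rng = np.random.default_rng(0)
    pcs = rng.normal(size=(2000, 10))
    is_case = rng.random(2000) < 1.0 / (1.0 + np.exp(-0.8 * pcs[:, 0]))
    strata = assign_strata(stratification_score(pcs, is_case))
    print(np.bincount(strata, minlength=5))
```

In a real analysis, each SNP would then be tested for association within strata, and the inflation factor recomputed from the resulting genome-wide test statistics to check how much confounding remains.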