Contemporary Considerations for Constructing a Genetic Risk Score: An Empirical Approach.
Genetic risk scores (GRSs) are an increasingly popular tool for summarizing the cumulative association of a set of single nucleotide polymorphisms (SNPs) with disease. Typically, only SNPs that have reached genome-wide significance are included in these scores. However, recent work suggests that including additional SNPs may aid risk assessment. In this paper, we used the Atherosclerosis Risk in Communities (ARIC) Study cohort to illustrate how one can choose the optimal set of SNPs for a GRS. In addition to the P-value threshold, we examined linkage disequilibrium, imputation quality, and imputation type, and we provide a variety of evaluation metrics. Results suggest that the P-value threshold had the greatest impact on GRS quality for the outcome of coronary heart disease, with an optimal threshold around 0.001, whereas GRSs were relatively robust to both linkage disequilibrium and imputation quality. We also show that the optimal GRS partially depends on the evaluation metric and, consequently, on the way one intends to use the GRS. Overall, the results highlight both the robustness of GRSs and a means to empirically choose the best GRS.
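To make the role of the P-value threshold concrete, the following is a minimal sketch of a weighted GRS built from SNPs that pass a chosen threshold. It is not the authors' pipeline: the `Snp` record, the function names, and the toy effect sizes and dosages are all hypothetical, and the sketch ignores linkage disequilibrium pruning and imputation quality, which the abstract also examines.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Snp:
    """Hypothetical summary-statistic record for one SNP."""
    snp_id: str     # illustrative identifier, not a real rsID
    beta: float     # per-allele effect size (e.g. log odds ratio) from an external study
    p_value: float  # association P-value from the same external study


def select_snps(snps: List[Snp], p_threshold: float) -> List[Snp]:
    """Keep only SNPs whose P-value is at or below the threshold."""
    return [s for s in snps if s.p_value <= p_threshold]


def genetic_risk_score(dosages: Dict[str, float], snps: List[Snp]) -> float:
    """Weighted sum of risk-allele dosages (0 to 2) over the selected SNPs.

    Assumption: a SNP missing from `dosages` contributes 0, which is a
    simplification chosen for this sketch only.
    """
    return sum(s.beta * dosages.get(s.snp_id, 0.0) for s in snps)


# Toy data, invented purely for illustration.
toy_snps = [
    Snp("snpA", beta=0.10, p_value=1e-9),
    Snp("snpB", beta=0.05, p_value=5e-4),
    Snp("snpC", beta=0.02, p_value=0.2),
]
toy_dosages = {"snpA": 2.0, "snpB": 1.0, "snpC": 1.0}

# Compare a genome-wide significance threshold with a more liberal one.
strict_score = genetic_risk_score(toy_dosages, select_snps(toy_snps, 5e-8))
liberal_score = genetic_risk_score(toy_dosages, select_snps(toy_snps, 1e-3))
```

In an empirical comparison of the kind the abstract describes, one would compute such scores across a grid of thresholds and evaluate each against the outcome with the metric that matches the intended use.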
Goldstein, BA; Yang, L; Salfati, E; Assimes, TL