Learning in glaucoma genetic risk assessment.
Genome-wide association (GWA) studies are powerful tools for identifying genes involved in common human diseases and are becoming increasingly important in genetic epidemiology research. However, the statistical approaches behind GWA studies cannot readily account for possible interactions among genetic markers, and true disease variants may be lost in statistical noise because of the stringent significance threshold. A typical GWA study reports a few highly suspected signals, e.g. single-nucleotide polymorphisms (SNPs), which usually account for only a tiny portion of the overall genetic risk for the disease of interest. This study proposes a computational learning approach, used in addition to parametric statistical methods and combined with a filtering mechanism, to build a glaucoma genetic risk assessment model. Our data set was obtained from the Singapore Malay Eye Study (SiMES) and genotyped on Illumina 610-Quad arrays. We constructed a case-control data set with 233 glaucoma and 458 healthy samples. A standard case-control association test was conducted on the post-QC data set of more than 500k SNPs. A genetic profile was constructed using genotype information from a list of 412 SNPs filtered by a relaxed p-value threshold of 1 × 10^-3, and this profile forms the feature space for learning. Among the five learning algorithms we evaluated, Support Vector Machines with a radial kernel (SVM-radial) achieved the best result, with an area under the ROC curve (AUC) of 99.4% and an accuracy of 95.9%. The result illustrates that a learning approach in post-GWAS data analysis is able to accurately assess genetic risk for glaucoma. The approach is more robust and comprehensive than the individual SNP matching method. We will further validate our results in several other data sets obtained from subsequent population studies conducted in Singapore.
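The filtering step described in the abstract (a per-SNP case-control association test followed by a relaxed p-value cut-off) can be illustrated with a minimal sketch. The abstract does not say which association test was used, so the allelic 2x2 chi-square test below is an assumption, as are the function names, the 0/1/2 minor-allele coding, and the toy genotypes. Only the 1 × 10^-3 threshold comes from the text.

```python
import math


def allelic_chi2_pvalue(case_genotypes, control_genotypes):
    """Allelic 2x2 chi-square test (1 degree of freedom) for one SNP.

    Genotypes are assumed to be coded as minor-allele counts: 0, 1, or 2.
    """
    # Count minor and major alleles in each group (two alleles per sample).
    case_minor = sum(case_genotypes)
    case_major = 2 * len(case_genotypes) - case_minor
    ctrl_minor = sum(control_genotypes)
    ctrl_major = 2 * len(control_genotypes) - ctrl_minor

    n = case_minor + case_major + ctrl_minor + ctrl_major
    row_totals = (case_minor + case_major, ctrl_minor + ctrl_major)
    col_totals = (case_minor + ctrl_minor, case_major + ctrl_major)
    if 0 in row_totals or 0 in col_totals:
        return 1.0  # monomorphic SNP or empty group: no evidence of association

    observed = ((case_minor, case_major), (ctrl_minor, ctrl_major))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-square distribution with 1 df:
    # P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def filter_snps(case_matrix, control_matrix, snp_ids, threshold=1e-3):
    """Return (snp_id, p_value) pairs for SNPs with p below the threshold."""
    selected = []
    for k, snp in enumerate(snp_ids):
        cases = [row[k] for row in case_matrix]
        controls = [row[k] for row in control_matrix]
        p = allelic_chi2_pvalue(cases, controls)
        if p < threshold:
            selected.append((snp, p))
    return selected


# Toy data (hypothetical): 20 cases and 20 controls, two SNPs.
# The first SNP differs sharply between groups; the second does not.
toy_cases = [[2, 1] for _ in range(20)]
toy_controls = [[0, 1] for _ in range(20)]
toy_ids = ["rs_toy_1", "rs_toy_2"]

selected = filter_snps(toy_cases, toy_controls, toy_ids)
print([snp for snp, _ in selected])
```

In a workflow like the one the abstract outlines, the genotype columns of the retained SNPs would then serve as the feature vector for a classifier such as an SVM with a radial kernel.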
Zhang, Z; Liu, J; Kwoh, CK; Sim, X; Tay, WT; Tan, Y; Yin, F; Wong, TY