Genetic predisposition to childhood acute lymphoblastic leukemia (ALL) is compellingly evidenced by recent genome-wide association studies (GWAS) identifying ARID5B, IKZF1, CEBPE, and CDKN2A/B as ALL susceptibility loci. However, these 4 loci cumulatively accounted for only 8% of genetic variability in ALL risk, suggesting that additional susceptibility variants remain to be identified in larger studies. Moreover, ALL GWAS have been restricted to populations of European descent, and the genetic basis of ALL susceptibility across diverse ethnic backgrounds is largely unknown. This is of particular importance because the incidence of ALL varies substantially by ethnicity.
Taking a multi-ethnic GWAS approach, we compared genotype frequency at 709,509 germline single nucleotide polymorphisms (SNPs) between 1,605 children with ALL and 6,661 controls of European, African, and Native American genetic ancestry (i.e., European American [EA], African American [AA], and Hispanics). After adjusting for population structure, 4 loci reached the genome-wide significance threshold of P<5×10^−8: 10q21.2 (ARID5B, P=5.9×10^−46), 7p12.2 (IKZF1, P=5.3×10^−24), and 14q11.2 (CEBPE, P=9×10^−12), as previously reported; and a novel ALL susceptibility locus at 10p12.31-12.2 (PIP4K2A, P=1.1×10^−11). While ARID5B, IKZF1, and PIP4K2A SNPs were associated with ALL across different ethnicities, the association at the CEBPE locus was more specific to EAs. ALL risk variants at ARID5B and PIP4K2A were most common in Hispanics, followed by EAs, and least common in AAs, in parallel with racial differences in the incidence of childhood ALL. We also performed multivariate analyses to determine the extent to which SNPs contribute independently to ALL susceptibility at each of the 4 loci. While associations at the ARID5B and CEBPE loci were completely explained by the respective top SNP within each region, the IKZF1 and PIP4K2A loci harbored multiple independent association signals. All 4 loci were validated in 3 independent replication series at the P<0.05 level in multiple ethnic groups: EAs, 574 cases and 2,601 controls; AAs, 128 cases and 1,075 controls; Hispanics, 143 cases and 640 controls. For example, the top PIP4K2A SNP was significant in all 3 replication studies: EAs, P=0.0017; AAs, P=0.009; and Hispanics, P=0.041. We next examined the cumulative effects of these 4 loci on ALL susceptibility by multi-marker analyses of top SNPs at ARID5B, IKZF1, PIP4K2A, and CEBPE.
In the combined discovery and replication cohorts (2,450 cases and 10,977 controls), the number of risk alleles at these 4 SNPs (genetic risk burden) was positively correlated with relative ALL risk; e.g., subjects with 6–8 copies of risk alleles (252 cases and 314 controls) were at 9.0-fold (95% confidence interval [CI], 6.9–11.8) higher risk of developing ALL than those with 0–1 copy of the risk alleles (153 cases and 1,753 controls). Interestingly, every copy of allele C at the ARID5B SNP rs10821936 conferred a 1.93-fold increase (95% CI, 1.8–2.08) in the risk of developing ALL in children younger than 10 years (N=1,947), whereas the added disease risk by each C allele was 1.48-fold (95% CI, 1.3–1.68) in children older than 10 years (N=503), implying plausible modifying effects of age on genetic predisposition to childhood ALL. The association between rs10821936 genotype and age at ALL diagnosis remained significant even after adjusting for molecular subtypes.
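As a rough arithmetic check on the risk-burden comparison above, the following sketch computes a crude (unadjusted) odds ratio and a Wald 95% confidence interval from the four counts given in the text. The function name `odds_ratio_wald` is illustrative, and the Wald interval on the log scale is an assumption made here, not necessarily the method used in the study. The reported 9.0-fold estimate presumably comes from a model that adjusts for factors such as genetic ancestry, so the crude figure is expected to be close to it but not identical.

```python
import math


def odds_ratio_wald(case_exp, ctrl_exp, case_ref, ctrl_ref, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    case_exp / ctrl_exp: cases and controls in the high-burden group.
    case_ref / ctrl_ref: cases and controls in the reference group.
    z: normal quantile for the interval (1.96 gives about 95%).
    """
    odds_ratio = (case_exp * ctrl_ref) / (ctrl_exp * case_ref)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_exp + 1 / ctrl_exp + 1 / case_ref + 1 / ctrl_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the text: 6-8 risk alleles (252 cases, 314 controls)
# versus 0-1 risk alleles (153 cases, 1,753 controls).
or_, lo, hi = odds_ratio_wald(252, 314, 153, 1753)
print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The crude estimate comes out at roughly 9.2 with an interval of about 7.3 to 11.6, in line with the reported 9.0 (6.9–11.8).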
In conclusion, we report the first multi-ethnic GWAS of childhood ALL, in which we comprehensively examined the role of inherited genetic variation in ALL susceptibility in diverse populations and identified a novel susceptibility locus at 10p12.31-12.2. Our results shed new light not only on the molecular etiology of childhood ALL but also on the genetic basis of racial differences in ALL incidence.
No relevant conflicts of interest to declare.