Scholars@Duke publication: Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

Publication, Journal Article

Delaneau, O; Marchini, J; 1000 Genomes Project Consortium

Published in: Nature communications

June 2014

A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First, the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetics community. Using a set of validation genotypes at SNPs and bi-allelic indels, we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.
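The abstract reports accuracy as genotype discordance against validation genotypes. As a rough illustration only, that quantity can be sketched as the fraction of compared sites where a called genotype differs from the validation genotype. The function name, the 0/1/2 alternate-allele coding, and the toy data below are assumptions made for this sketch; they are not taken from the paper, which may define and stratify discordance differently.

```python
def genotype_discordance(called, validation):
    """Fraction of compared sites where the called genotype differs
    from the validation genotype.

    Genotypes are alternate-allele counts (0, 1, or 2). None marks a
    missing genotype; sites missing in either list are skipped.
    This coding is an assumption made for the illustration.
    """
    compared = 0
    mismatches = 0
    for c, v in zip(called, validation):
        if c is None or v is None:
            continue  # cannot compare a missing genotype
        compared += 1
        if c != v:
            mismatches += 1
    if compared == 0:
        raise ValueError("no sites with both genotypes present")
    return mismatches / compared


# Hypothetical toy data: six sites for one sample.
called = [0, 1, 2, 1, None, 0]
validation = [0, 1, 1, 1, 2, 0]
rate = genotype_discordance(called, validation)
```

Here five sites are comparable and one of them disagrees, so `rate` is 0.2. Lower values indicate closer agreement with the validation set.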

Duke Scholars

Author: Charmaine DM Royal, African & African American Studies

Published In

Nature communications

DOI

10.1038/ncomms4934

EISSN

2041-1723

ISSN

2041-1723

Publication Date

June 2014

Volume

5

Start / End Page

3934

Related Subject Headings

Polymorphism, Single Nucleotide
Microarray Analysis
Humans
Haplotypes
Genome-Wide Association Study
Genome, Human
Gene Frequency
Alleles
Algorithms

Citation

Delaneau, O., Marchini, J., & 1000 Genomes Project Consortium. (2014). Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nature Communications, 5, 3934. https://doi.org/10.1038/ncomms4934

Delaneau, Olivier, Jonathan Marchini, and 1000 Genomes Project Consortium. “Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.” Nature Communications 5 (June 2014): 3934. https://doi.org/10.1038/ncomms4934.

Delaneau O, Marchini J, 1000 Genomes Project Consortium. Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel. Nature communications. 2014 Jun;5:3934.

Delaneau, Olivier, et al. “Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.” Nature Communications, vol. 5, June 2014, p. 3934. Epmc, doi:10.1038/ncomms4934.
