Large-scale identification and analysis of genome-wide single-nucleotide polymorphisms for mapping in Arabidopsis thaliana.
Genetic markers such as single nucleotide polymorphisms (SNPs) are essential tools for positional cloning, association or quantitative trait locus mapping, and the determination of genetic relationships between individuals. We identified and characterized a genome-wide set of SNP markers by generating 10,706 expressed sequence tags (ESTs) from cDNA libraries derived from 6 different accessions, and by analyzing 606 sequence-tagged sites (STSs) from up to 12 accessions of the model flowering plant Arabidopsis thaliana. The cDNA libraries for EST sequencing were made from individuals that were stressed by various means to enrich for transcripts from genes expressed under such conditions. SNPs discovered in these sequences may be useful markers for mapping genes involved in interactions with the biotic and abiotic environment. The STS loci are distributed randomly over the genome. By comparison with the Col-0 genome sequence, we identified a total of 8,051 SNPs and 637 insertion/deletion polymorphisms (InDels). Analysis of STS-derived SNPs shows that most SNPs are rare, but that it is possible to identify intermediate-frequency framework markers that can be used for genetic mapping in many different combinations of accessions. A substantial proportion of SNPs located in ORFs caused a change in the encoded amino acid. A comparison of the density of our SNP markers among accessions in both the EST and STS datasets revealed that Cvi-0 is the most divergent accession from Col-0 among the 12 accessions studied. All of these markers are freely available via the internet.
Schmid, KJ; Sorensen, TR; Stracke, R; Torjek, O; Altmann, T; Mitchell-Olds, T; Weisshaar, B