Progressive Cactus is a multiple-genome aligner for the thousand-genome era.

Publication, Journal Article
Armstrong, J; Hickey, G; Diekhans, M; Fiddes, IT; Novak, AM; Deran, A; Fang, Q; Xie, D; Feng, S; Stiller, J; Genereux, D; Johnson, J ...
Published in: Nature
November 2020

New genome assemblies have been arriving at a rapidly increasing pace, thanks to decreases in sequencing costs and improvements in third-generation sequencing technologies [1-3]. For example, the number of vertebrate genome assemblies currently in the NCBI (National Center for Biotechnology Information) database [4] increased by more than 50% to 1,485 assemblies in the year from July 2018 to July 2019. In addition to this influx of assemblies from different species, new human de novo assemblies [5] are being produced, which enable the analysis of not only small polymorphisms, but also complex, large-scale structural differences between human individuals and haplotypes. This coming era and its unprecedented amount of data offer the opportunity to uncover many insights into genome evolution but also present challenges in how to adapt current analysis methods to meet the increased scale. Cactus [6], a reference-free multiple genome alignment program, has been shown to be highly accurate, but the existing implementation scales poorly with increasing numbers of genomes, and struggles in regions of highly duplicated sequences. Here we describe progressive extensions to Cactus to create Progressive Cactus, which enables the reference-free alignment of tens to thousands of large vertebrate genomes while maintaining high alignment quality. We describe results from an alignment of more than 600 amniote genomes, which is to our knowledge the largest multiple vertebrate genome alignment created so far.

Published In

Nature

DOI

10.1038/s41586-020-2871-y

EISSN

1476-4687

Publication Date

November 2020

Volume

587

Issue

7833

Start / End Page

246 / 251

Location

England

Related Subject Headings

  • Vertebrates
  • Software
  • Sequence Alignment
  • Quality Control
  • Humans
  • Haplotypes
  • Genomics
  • Genome
  • General Science & Technology
  • Computer Simulation
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Armstrong, J., Hickey, G., Diekhans, M., Fiddes, I. T., Novak, A. M., Deran, A., … Paten, B. (2020). Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature, 587(7833), 246–251. https://doi.org/10.1038/s41586-020-2871-y
Armstrong, Joel, Glenn Hickey, Mark Diekhans, Ian T. Fiddes, Adam M. Novak, Alden Deran, Qi Fang, et al. “Progressive Cactus is a multiple-genome aligner for the thousand-genome era.” Nature 587, no. 7833 (November 2020): 246–51. https://doi.org/10.1038/s41586-020-2871-y.
Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020 Nov;587(7833):246–51.
Armstrong, Joel, et al. “Progressive Cactus is a multiple-genome aligner for the thousand-genome era.” Nature, vol. 587, no. 7833, Nov. 2020, pp. 246–51. PubMed, doi:10.1038/s41586-020-2871-y.
Armstrong J, Hickey G, Diekhans M, Fiddes IT, Novak AM, Deran A, Fang Q, Xie D, Feng S, Stiller J, Genereux D, Johnson J, Marinescu VD, Alföldi J, Harris RS, Lindblad-Toh K, Haussler D, Karlsson E, Jarvis ED, Zhang G, Paten B. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature. 2020 Nov;587(7833):246–251.