Simultaneous Bayesian estimation of alignment and phylogeny under a joint model of protein sequence and structure.

Journal Article

For sequences that are highly divergent, there is often insufficient information to infer accurate alignments, and phylogenetic uncertainty may be high. One way to address this issue is to make use of protein structural information, since structures generally diverge more slowly than sequences. In this work, we extend a recently developed stochastic model of pairwise structural evolution to multiple structures on a tree, analytically integrating over ancestral structures to permit efficient likelihood computations under the resulting joint sequence-structure model. We observe that the inclusion of structural information significantly reduces alignment and topology uncertainty, and reduces the number of topology and alignment errors in cases where the true trees and alignments are known. In some cases, the inclusion of structure results in changes to the consensus topology, indicating that structure may contain additional information beyond that which can be obtained from sequences. We use the model to investigate the order of divergence of cytoglobins, myoglobins, and hemoglobins and observe a stabilization of phylogenetic inference: although a sequence-based inference assigns significant posterior probability to several different topologies, the structural model strongly favors one of these over the others and is more robust to the choice of data set.

Cited Authors

  • Herman, JL; Challis, CJ; Novák, Á; Hein, J; Schmidler, SC

Published Date

  • September 2014

Published In

  • Molecular Biology and Evolution

Volume / Issue

  • 31 / 9

Start / End Page

  • 2251 - 2266

PubMed ID

  • 24899668

PubMed Central ID

  • PMC4137710

Electronic International Standard Serial Number (EISSN)

  • 1537-1719

International Standard Serial Number (ISSN)

  • 0737-4038

Digital Object Identifier (DOI)

  • 10.1093/molbev/msu184


Language

  • eng