Simultaneous Bayesian estimation of alignment and phylogeny under a joint model of protein sequence and structure.
Journal Article
For sequences that are highly divergent, there is often insufficient information to infer accurate alignments, and phylogenetic uncertainty may be high. One way to address this issue is to make use of protein structural information, since structures generally diverge more slowly than sequences. In this work, we extend a recently developed stochastic model of pairwise structural evolution to multiple structures on a tree, analytically integrating over ancestral structures to permit efficient likelihood computations under the resulting joint sequence-structure model. We observe that the inclusion of structural information significantly reduces alignment and topology uncertainty, and reduces the number of topology and alignment errors in cases where the true trees and alignments are known. In some cases, the inclusion of structure results in changes to the consensus topology, indicating that structure may contain additional information beyond that which can be obtained from sequences. We use the model to investigate the order of divergence of cytoglobins, myoglobins, and hemoglobins and observe a stabilization of phylogenetic inference: although a sequence-based inference assigns significant posterior probability to several different topologies, the structural model strongly favors one of these over the others and is more robust to the choice of data set.
Cited Authors
- Herman, JL; Challis, CJ; Novák, Á; Hein, J; Schmidler, SC
Published Date
- September 2014
Published In
- Molecular Biology and Evolution
Volume / Issue
- 31 / 9
Start / End Page
- 2251 - 2266
PubMed ID
- 24899668
PubMed Central ID
- PMC4137710
Electronic International Standard Serial Number (EISSN)
- 1537-1719
International Standard Serial Number (ISSN)
- 0737-4038
Digital Object Identifier (DOI)
- 10.1093/molbev/msu184
Language
- eng