Incorporating nearest-neighbor site dependence into protein evolution models

Publication, Conference
Larson, G; Thorne, JL; Schmidler, S
Published in: Journal of Computational Biology
March 1, 2020

Evolutionary models of proteins are widely used for statistical sequence alignment and inference of homology and phylogeny. However, the vast majority of these models rely on an unrealistic assumption of independent evolution between sites. Here we focus on the related problem of protein structure alignment, a classic tool of computational biology that is widely used to identify structural and functional similarity and to infer homology among proteins. A site-independent statistical model for protein structural evolution has previously been introduced and shown to significantly improve alignments and phylogenetic inferences compared with approaches that utilize only amino acid sequence information. Here we extend this model to account for correlated evolutionary drift among neighboring amino acid positions. The result is a spatiotemporal model of protein structure evolution, described by a multivariate diffusion process convolved with a spatial birth-death process. This extended site-dependent model (SDM) comes with little additional computational cost or analytical complexity compared with the site-independent model (SIM). We demonstrate that this SDM yields a significant reduction of bias in estimated evolutionary distances and helps further improve phylogenetic tree reconstruction. We also develop a simple model of site-dependent sequence evolution, which we use to demonstrate the bias resulting from the application of standard site-independent sequence evolution models.

Published In

Journal of Computational Biology

DOI

10.1089/cmb.2019.0500

ISSN

1066-5277

Publication Date

March 1, 2020

Volume

27

Issue

3

Start / End Page

361 / 375

Related Subject Headings

  • Structural Homology, Protein
  • Sequence Analysis, Protein
  • Sequence Alignment
  • Proteins
  • Models, Statistical
  • Evolution, Molecular
  • Computational Biology
  • Bioinformatics
  • 49 Mathematical sciences
  • 46 Information and computing sciences
 

Citation

APA: Larson, G., Thorne, J. L., & Schmidler, S. (2020). Incorporating nearest-neighbor site dependence into protein evolution models. Journal of Computational Biology, 27(3), 361–375. https://doi.org/10.1089/cmb.2019.0500
Chicago: Larson, G., J. L. Thorne, and S. Schmidler. “Incorporating nearest-neighbor site dependence into protein evolution models.” Journal of Computational Biology 27, no. 3 (March 1, 2020): 361–75. https://doi.org/10.1089/cmb.2019.0500.
ICMJE: Larson G, Thorne JL, Schmidler S. Incorporating nearest-neighbor site dependence into protein evolution models. Journal of Computational Biology. 2020 Mar 1;27(3):361–75.
MLA: Larson, G., et al. “Incorporating nearest-neighbor site dependence into protein evolution models.” Journal of Computational Biology, vol. 27, no. 3, Mar. 2020, pp. 361–75. Manual, doi:10.1089/cmb.2019.0500.
NLM: Larson G, Thorne JL, Schmidler S. Incorporating nearest-neighbor site dependence into protein evolution models. Journal of Computational Biology. 2020 Mar 1;27(3):361–375.