The effects of nucleotide substitution model assumptions on estimates of nonparametric bootstrap support.

Journal Article

The use of parameter-rich substitution models in molecular phylogenetics has been criticized on the basis that these models can cause a reduction both in accuracy and in the ability to discriminate among competing topologies. We have explored the relationship between nucleotide substitution model complexity and nonparametric bootstrap support under maximum likelihood (ML) for six data sets for which the true relationships are known with a high degree of certainty. We also performed equally weighted maximum parsimony analyses in order to assess the effects of ignoring branch length information during tree selection. We observed that maximum parsimony gave the lowest mean estimate of bootstrap support for the correct set of nodes relative to the ML models for every data set except one. For several data sets, we established that the exact distribution used to model among-site rate variation was critical for a successful phylogenetic analysis. Site-specific rate models were shown to perform very poorly relative to gamma and invariable sites models for several of the data sets most likely because of the gross underestimation of branch lengths. The invariable sites model also performed poorly for several data sets where this model had a poor fit to the data, suggesting that addition of the gamma distribution can be critical. Estimates of bootstrap support for the correct nodes often increased under gamma and invariable sites models relative to equal rates models. Our observations are contrary to the prediction that such models cause reduced confidence in phylogenetic hypotheses. Our results raise several issues regarding the process of model selection, and we briefly discuss model selection uncertainty and the role of sensitivity analyses in molecular phylogenetics.

Full Text

Duke Authors

Cited Authors

  • Buckley, TR; Cunningham, CW

Published Date

  • April 2002

Published In

Volume / Issue

  • 19 / 4

Start / End Page

  • 394 - 405

PubMed ID

  • 11919280

International Standard Serial Number (ISSN)

  • 0737-4038

Language

  • eng

Conference Location

  • United States