MolProbity's ultimate rotamer-library distributions for model validation.

Published

Journal Article

Here we describe the updated MolProbity rotamer-library distributions derived from an order-of-magnitude larger and more stringently quality-filtered dataset of about 8000 (vs. 500) protein chains, and we explain the resulting changes and improvements to model validation as seen by users. To include only side-chains with satisfactory justification for their given conformation, we added residue-specific filters for electron-density value and model-to-density fit. The combined new protocol retains a million residues of data, while cleaning up false-positive noise in the multi-χ datapoint distributions. It enables unambiguous characterization of conformational clusters nearly 1000-fold less frequent than the most common ones. We describe examples of local interactions that favor these rare conformations, including the role of authentic covalent bond-angle deviations in enabling presumably strained side-chain conformations. Further, along with favored and outlier, an allowed category (0.3-2.0% occurrence in reference data) has been added, analogous to Ramachandran validation categories. The new rotamer distributions are used for current rotamer validation in MolProbity and PHENIX, and for rotamer choice in PHENIX model-building and refinement. The multi-dimensional χ distributions and Top8000 reference dataset are freely available on GitHub. These rotamers are termed "ultimate" because data sampling and quality are now fully adequate for this task, and also because we believe the future of conformational validation should integrate side-chain with backbone criteria. Proteins 2016; 84:1177-1189. © 2016 Wiley Periodicals, Inc.
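The three validation categories named in the abstract can be illustrated with a minimal sketch. It assumes a rotamer score expressed as a percent occurrence in the reference data, with outlier below 0.3%, allowed from 0.3% up to 2.0%, and favored at 2.0% and above. The function name `classify_rotamer` and the handling of the exact boundary values are assumptions made for illustration; this is not the MolProbity or PHENIX implementation.

```python
def classify_rotamer(score_percent: float) -> str:
    """Map a rotamer score (percent occurrence in reference data) to a category.

    Thresholds follow the 0.3-2.0% "allowed" band stated in the abstract.
    Treating exactly 0.3 as allowed and exactly 2.0 as favored is an
    assumption of this sketch, not something the abstract specifies.
    """
    # Reject values outside 0-100; this comparison also rejects NaN.
    if not 0.0 <= score_percent <= 100.0:
        raise ValueError("score_percent must be between 0 and 100")
    if score_percent < 0.3:
        return "outlier"
    if score_percent < 2.0:
        return "allowed"
    return "favored"


# Hypothetical scores, chosen only to exercise each category.
for score in (0.1, 1.0, 35.0):
    print(f"{score:5.1f}% -> {classify_rotamer(score)}")
```

As with the Ramachandran categories the abstract compares these to, the middle band lets a rare but well-supported conformation be reported separately from a likely modeling error.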

Cited Authors

  • Hintze, BJ; Lewis, SM; Richardson, JS; Richardson, DC

Published Date

  • September 2016

Published In

  • Proteins

Volume / Issue

  • 84 / 9

Start / End Page

  • 1177 - 1189

PubMed ID

  • 27018641

PubMed Central ID

  • 27018641

Electronic International Standard Serial Number (EISSN)

  • 1097-0134

Digital Object Identifier (DOI)

  • 10.1002/prot.25039

Language

  • eng

Conference Location

  • United States