Assessment of detailed conformations suggests strategies for improving cryoEM models: Helix at lower resolution, ensembles, pre-refinement fixups, and validation at multi-residue length scale.

Published

Journal Article

We find that the overall quite good methods used in the CryoEM Model Challenge could still benefit greatly from several strategies for improving local conformations. Our assessments primarily use validation criteria from the MolProbity web service. Those criteria include MolProbity's all-atom contact analysis, updated versions of standard conformational validations for protein and RNA, plus two recent additions: first, flags for cis-nonPro and twisted peptides, and second, the CaBLAM system for diagnosing secondary structure, validating Cα backbone, and validating adjacent peptide CO orientations in the context of the Cα trace. In general, automated ab initio building of starting models is quite good at backbone connectivity but often fails at local conformation or sequence register, especially at poorer than 3.5 Å resolution. However, we show that even if criteria (such as Ramachandran or rotamer) are explicitly restrained to improve refinement behavior and overall validation scores, automated optimization of a deposited structure seldom corrects specific misfittings that start in the wrong local minimum, but just hides them. Therefore, local problems should be identified, and as many as possible corrected, before starting refinement. Secondary structures are confusing at 3-4 Å but can be better recognized at 6-8 Å. In future model challenges, specific steps being tested (such as segmentation) and the required documentation (such as PDB code of starting model) should each be explicitly defined, so competing methods on a given task can be meaningfully compared. Individual local examples are presented here, to understand what local mistakes and corrections look like in 3D, how they probably arise, and what possible improvements to methodology might help avoid them. At these resolutions, both structural biologists and end-users need meaningful estimates of local uncertainty, perhaps through explicit ensembles. Fitting problems can best be diagnosed by validation that spans multiple residues; CaBLAM is such a multi-residue tool, and its effectiveness is demonstrated.

Full Text

Duke Authors

Cited Authors

  • Richardson, JS; Williams, CJ; Videau, LL; Chen, VB; Richardson, DC

Published Date

  • November 2018

Published In

Volume / Issue

  • 204 / 2

Start / End Page

  • 301 - 312

PubMed ID

  • 30107233

Pubmed Central ID

  • 30107233

Electronic International Standard Serial Number (EISSN)

  • 1095-8657

International Standard Serial Number (ISSN)

  • 1047-8477

Digital Object Identifier (DOI)

  • 10.1016/j.jsb.2018.08.007

Language

  • eng