Assessment of detailed conformations suggests strategies for improving cryoEM models: Helix at lower resolution, ensembles, pre-refinement fixups, and validation at multi-residue length scale.
We find that the overall quite good methods used in the CryoEM Model Challenge could still benefit greatly from several strategies for improving local conformations. Our assessments primarily use validation criteria from the MolProbity web service. Those criteria include MolProbity's all-atom contact analysis, updated versions of standard conformational validations for protein and RNA, plus two recent additions: first, flags for cis-nonPro and twisted peptides, and second, the CaBLAM system for diagnosing secondary structure, validating Cα backbone, and validating adjacent peptide CO orientations in the context of the Cα trace. In general, automated ab initio building of starting models is quite good at backbone connectivity but often fails at local conformation or sequence register, especially at poorer than 3.5 Å resolution. However, we show that even if criteria (such as Ramachandran or rotamer) are explicitly restrained to improve refinement behavior and overall validation scores, automated optimization of a deposited structure seldom corrects specific misfittings that start in the wrong local minimum, but just hides them. Therefore, local problems should be identified, and as many as possible corrected, before starting refinement. Secondary structures are confusing at 3-4 Å but can be better recognized at 6-8 Å. In future model challenges, specific steps being tested (such as segmentation) and the required documentation (such as PDB code of starting model) should each be explicitly defined, so competing methods on a given task can be meaningfully compared. Individual local examples are presented here, to understand what local mistakes and corrections look like in 3D, how they probably arise, and what possible improvements to methodology might help avoid them. At these resolutions, both structural biologists and end-users need meaningful estimates of local uncertainty, perhaps through explicit ensembles. Fitting problems can best be diagnosed by validation that spans multiple residues; CaBLAM is such a multi-residue tool, and its effectiveness is demonstrated.
Richardson, JS; Williams, CJ; Videau, LL; Chen, VB; Richardson, DC
Volume / Issue
Start / End Page
Pubmed Central ID
Electronic International Standard Serial Number (EISSN)
International Standard Serial Number (ISSN)
Digital Object Identifier (DOI)