The bad and the good of trends in model building and refinement for sparse-data regions: pernicious forms of overfitting versus good new tools and predictions.

Publication, Journal Article
Richardson, JS; Williams, CJ; Chen, VB; Prisant, MG; Richardson, DC
Published in: Acta Crystallogr D Struct Biol
December 1, 2023

Model building and refinement, and the validation of their correctness, are very effective and reliable at local resolutions better than about 2.5 Å for both crystallography and cryo-EM. However, at local resolutions worse than 2.5 Å both the procedures and their validation break down and do not ensure reliably correct models. This is because in the broad density at lower resolution, critical features such as protein backbone carbonyl O atoms are not just less accurate but are not seen at all, and so peptide orientations are frequently wrongly fitted by 90-180°. This puts both backbone and side chains into the wrong local energy minimum, and they are then worsened rather than improved by further refinement into a valid but incorrect rotamer or Ramachandran region. On the positive side, new tools are being developed to locate this type of pernicious error in PDB depositions, such as CaBLAM, EMRinger, Pperp diagnosis of ribose puckers, and peptide flips in PDB-REDO, while interactive modeling in Coot or ISOLDE can help to fix many of them. Another positive trend is that artificial intelligence predictions such as those made by AlphaFold2 contribute additional evidence from large multiple sequence alignments, and in high-confidence parts they provide quite good starting models for loops, termini or whole domains with otherwise ambiguous density.

Published In

Acta Crystallogr D Struct Biol

DOI

10.1107/S2059798323008847

EISSN

2059-7983

Publication Date

December 1, 2023

Volume

79

Issue

Pt 12

Start / End Page

1071 / 1078

Location

United States

Related Subject Headings

  • Proteins
  • Protein Conformation
  • Peptides
  • Models, Molecular
  • Crystallography, X-Ray
  • Cryoelectron Microscopy
  • Artificial Intelligence
 

Citation

APA

Richardson, J. S., Williams, C. J., Chen, V. B., Prisant, M. G., & Richardson, D. C. (2023). The bad and the good of trends in model building and refinement for sparse-data regions: pernicious forms of overfitting versus good new tools and predictions. Acta Crystallogr D Struct Biol, 79(Pt 12), 1071–1078. https://doi.org/10.1107/S2059798323008847

Chicago

Richardson, Jane S., Christopher J. Williams, Vincent B. Chen, Michael G. Prisant, and David C. Richardson. “The bad and the good of trends in model building and refinement for sparse-data regions: pernicious forms of overfitting versus good new tools and predictions.” Acta Crystallogr D Struct Biol 79, no. Pt 12 (December 1, 2023): 1071–78. https://doi.org/10.1107/S2059798323008847.

ICMJE

Richardson JS, Williams CJ, Chen VB, Prisant MG, Richardson DC. The bad and the good of trends in model building and refinement for sparse-data regions: pernicious forms of overfitting versus good new tools and predictions. Acta Crystallogr D Struct Biol. 2023 Dec 1;79(Pt 12):1071–8.

MLA

Richardson, Jane S., et al. “The bad and the good of trends in model building and refinement for sparse-data regions: pernicious forms of overfitting versus good new tools and predictions.” Acta Crystallogr D Struct Biol, vol. 79, no. Pt 12, Dec. 2023, pp. 1071–78. Pubmed, doi:10.1107/S2059798323008847.

NLM

Richardson JS, Williams CJ, Chen VB, Prisant MG, Richardson DC. The bad and the good of trends in model building and refinement for sparse-data regions: pernicious forms of overfitting versus good new tools and predictions. Acta Crystallogr D Struct Biol. 2023 Dec 1;79(Pt 12):1071–1078.