Hash: A program to accurately predict protein Hα shifts from neighboring backbone shifts

Journal Article

Chemical shifts provide not only peak identities for analyzing nuclear magnetic resonance (NMR) data, but also an important source of conformational information for studying protein structures. Current structural studies requiring Hα chemical shifts suffer from the following limitations. (1) For large proteins, the Hα chemical shifts can be difficult to assign using conventional NMR triple-resonance experiments, mainly because the fast transverse relaxation of Cα limits signal sensitivity. (2) Previous chemical shift prediction approaches either require homologous models with high sequence similarity or rely heavily on accurate backbone and side-chain structural coordinates. When neither sequence homologues nor structural coordinates are available, we must resort to other information to predict Hα chemical shifts. Predicting accurate Hα chemical shifts using other obtainable information, such as the chemical shifts of nearby backbone atoms (i.e., adjacent atoms in the sequence), can address both limitations, and hence advance NMR-based structural studies of proteins. By specifically exploiting the dependencies on chemical shifts of nearby backbone atoms, we propose a novel machine learning algorithm, called Hash, to predict Hα chemical shifts. Hash combines a new fragment-based chemical shift search approach with a non-parametric regression model, called the generalized additive model, to effectively solve the prediction problem. We demonstrate that the chemical shifts of nearby backbone atoms provide a reliable source of information for predicting accurate Hα chemical shifts. Our testing results on different possible combinations of input data indicate that Hash has a wide range of potential NMR applications in structural and biological studies of proteins. © 2012 Springer Science+Business Media Dordrecht.
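To make the idea of predicting Hα from neighboring backbone shifts concrete, the following is a minimal sketch of a nearest-fragment lookup. It is not the Hash algorithm: the abstract does not specify the fragment search or the generalized additive model in detail, so a plain k-nearest-neighbor average is swapped in here. The database values, the feature order (Cα, Cβ, N, HN), the scaling factors, and the names `TOY_DB`, `SCALES`, `distance`, and `predict_ha` are all assumptions made for illustration.

```python
import math

# Toy database with made-up, illustrative numbers (NOT real measurements).
# Each entry pairs a tuple of backbone shifts in ppm with a known H-alpha shift.
# Assumed feature order: (CA, CB, N, HN).
TOY_DB = [
    ((58.0, 30.0, 120.0, 8.0), 4.10),
    ((56.0, 32.0, 122.0, 8.4), 4.50),
    ((62.0, 69.0, 115.0, 8.2), 4.30),
    ((52.0, 19.0, 124.0, 8.1), 4.35),
]

# Hypothetical per-nucleus scales so that no single nucleus dominates the distance.
SCALES = (2.0, 2.0, 4.0, 0.5)


def distance(a, b, scales=SCALES):
    """Scaled Euclidean distance between two shift tuples."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, scales)))


def predict_ha(query, db=TOY_DB, k=2):
    """Average the H-alpha shifts of the k database entries closest to the query."""
    ranked = sorted(db, key=lambda entry: distance(query, entry[0]))
    nearest = ranked[:k]
    return sum(ha for _, ha in nearest) / len(nearest)


if __name__ == "__main__":
    print(predict_ha((58.0, 30.0, 120.0, 8.0), k=1))
```

In a sketch like this, the scaling factors control how much each nucleus contributes to the match. A regression stage such as the generalized additive model named in the abstract would replace the simple average with a fitted function of the input shifts.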

Cited Authors

  • Zeng, J; Zhou, P; Donald, BR

Published Date

  • 2013

Published In

Volume / Issue

  • 55 / 1

Start / End Page

  • 105 - 118

International Standard Serial Number (ISSN)

  • 0925-2738

Digital Object Identifier (DOI)

  • 10.1007/s10858-012-9693-7