D-SCRIPT translates genome to phenome with sequence-based, structure-aware, genome-scale predictions of protein-protein interactions.
We combine advances in neural language modeling and structurally motivated design to develop D-SCRIPT, an interpretable and generalizable deep-learning model, which predicts interaction between two proteins using only their sequence and maintains high accuracy with limited training data and across species. We show that a D-SCRIPT model trained on 38,345 human PPIs enables significantly improved functional characterization of fly proteins compared with the state-of-the-art approach. Evaluating the same D-SCRIPT model on protein complexes with known 3D structure, we find that the inter-protein contact map output by D-SCRIPT has significant overlap with the ground truth. We apply D-SCRIPT to screen for PPIs in cow (Bos taurus) at a genome-wide scale and focusing on rumen physiology, identify functional gene modules related to metabolism and immune response. The predicted interactions can then be leveraged for function prediction at scale, addressing the genome-to-phenome challenge, especially in species where little data are available.
Duke Scholars
Altmetric Attention Stats
Dimensions Citation Stats
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Location
Related Subject Headings
- Proteins
- Phenomics
- Cattle
- Animals
- 3101 Biochemistry and cell biology
- 0601 Biochemistry and Cell Biology
Citation
Published In
DOI
EISSN
Publication Date
Volume
Issue
Start / End Page
Location
Related Subject Headings
- Proteins
- Phenomics
- Cattle
- Animals
- 3101 Biochemistry and cell biology
- 0601 Biochemistry and Cell Biology