TT3D: Leveraging precomputed protein 3D sequence models to predict protein-protein interactions.
MOTIVATION: High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, it is not immediately clear how best to leverage these models for high-throughput prediction of which pairs of proteins interact. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein sequence, using tokens from a 21-letter discretized structural alphabet (3Di).
RESULTS: We show that using both the amino acid sequence and the 3Di sequence generated by Foldseek as inputs to our recent deep-learning method, Topsy-Turvy, substantially improves the performance of predicting protein-protein interactions across species. Thus, TT3D (Topsy-Turvy 3D) offers a way to reuse the computational effort already invested in producing high-quality structural models from sequence, while remaining lightweight enough to make high-quality binary protein-protein interaction predictions across all protein pairs genome-wide.
AVAILABILITY AND IMPLEMENTATION: TT3D is available at https://github.com/samsledje/D-SCRIPT. An archived version of the code at time of submission can be found at https://zenodo.org/records/10037674.
Published In
DOI
EISSN
Publication Date
Volume
Issue
Location
Related Subject Headings
- Software
- Proteins
- Bioinformatics
- Amino Acid Sequence
- 49 Mathematical sciences
- 46 Information and computing sciences
- 31 Biological sciences
- 08 Information and Computing Sciences
- 06 Biological Sciences
- 01 Mathematical Sciences