Active learning for computational chemogenomics.
Computational chemogenomics models the compound-protein interaction space, typically for drug discovery. Existing methods predominantly either incorporate increasing numbers of bioactivity samples or focus on specific subfamilies of proteins and ligands. As an alternative to modeling entire large datasets at once, active learning adaptively incorporates a minimal number of informative examples for modeling, yielding compact but high-quality models. Results/methodology: We assessed active learning for protein/target family-wide chemogenomic modeling in replicate experiments. Results demonstrate that small yet highly predictive models can be extracted from only 10-25% of large bioactivity datasets, irrespective of the molecule descriptors used. Chemogenomic active learning identifies small subsets of ligand-target interactions in a large screening database that lead to knowledge discovery and highly predictive models.
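The adaptive selection described in the abstract can be illustrated with a minimal pool-based active learning loop. This is only a sketch under stated assumptions, not the authors' method: the nearest-centroid model, the uncertainty-sampling rule, the synthetic two-dimensional "descriptors", and all names (`make_pool`, `active_learning`, `budget_fraction`) are hypothetical choices made for illustration. The loop starts from one example per class, repeatedly queries the unlabeled example the current model is least sure about, and stops once a fixed fraction of the pool (here 20%, within the 10-25% range quoted above) has been labeled.

```python
import random


def make_pool(n=200, seed=1):
    """Synthetic stand-in for ligand-target pair descriptors (assumption, not real data)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for i in range(n):
        y = i % 2                      # 1 = "active" pair, 0 = "inactive" pair
        center = 3.0 if y else 0.0
        xs.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        ys.append(y)
    return xs, ys


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def margin(x, c_pos, c_neg):
    """Signed score: positive means x is closer to the 'active' centroid."""
    return sq_dist(x, c_neg) - sq_dist(x, c_pos)


def fit(pool_x, oracle_y, labeled):
    """Fit the toy nearest-centroid model on the currently labeled indices."""
    c_pos = centroid([pool_x[i] for i in labeled if oracle_y[i] == 1])
    c_neg = centroid([pool_x[i] for i in labeled if oracle_y[i] == 0])
    return c_pos, c_neg


def active_learning(pool_x, oracle_y, budget_fraction=0.2, seed=0):
    rng = random.Random(seed)
    # Simplification: seed with one known example per class so both centroids exist.
    pos = [i for i, y in enumerate(oracle_y) if y == 1]
    neg = [i for i, y in enumerate(oracle_y) if y == 0]
    labeled = [rng.choice(pos), rng.choice(neg)]
    budget = max(2, int(budget_fraction * len(pool_x)))
    while len(labeled) < budget:
        c_pos, c_neg = fit(pool_x, oracle_y, labeled)
        chosen = set(labeled)
        candidates = [i for i in range(len(pool_x)) if i not in chosen]
        # Uncertainty sampling: query the example with the smallest absolute margin.
        nxt = min(candidates, key=lambda i: abs(margin(pool_x[i], c_pos, c_neg)))
        labeled.append(nxt)            # "ask the oracle" for the label of nxt
    return labeled, fit(pool_x, oracle_y, labeled)


pool_x, oracle_y = make_pool()
labeled, (c_pos, c_neg) = active_learning(pool_x, oracle_y)
predictions = [1 if margin(x, c_pos, c_neg) > 0 else 0 for x in pool_x]
accuracy = sum(p == y for p, y in zip(predictions, oracle_y)) / len(oracle_y)
print(f"labeled {len(labeled)} of {len(pool_x)} examples, pool accuracy {accuracy:.2f}")
```

In a real chemogenomic setting the feature vectors would come from combined ligand and protein descriptors, and the model and selection strategy would be chosen to suit the data; the toy classifier here is used only to keep the sketch dependency-free.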
Duke Scholars
Published In
DOI
EISSN
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Proteins
- Models, Chemical
- Medicinal & Biomolecular Chemistry
- Machine Learning
- Ligands
- Genomics
- Drug Discovery
- Databases, Chemical
- Computer Simulation
- Computational Biology