Finding the most potent compounds using active learning on molecular pairs.

Publication, Journal Article
Fralish, Z; Reker, D
Published in: Beilstein journal of organic chemistry
January 2024

Active learning allows algorithms to steer iterative experimentation to accelerate and de-risk molecular optimizations, but actively trained models might still exhibit poor performance during early project stages where the training data is limited and model exploitation might lead to analog identification with limited scaffold diversity. Here, we present ActiveDelta, an adaptive approach that leverages paired molecular representations to predict improvements from the current best training compound to prioritize further data acquisition. We apply the ActiveDelta concept to both graph-based deep (Chemprop) and tree-based (XGBoost) models during exploitative active learning for 99 Ki benchmarking datasets. We show that both ActiveDelta implementations excel at identifying more potent inhibitors compared to the standard exploitative active learning implementations of Chemprop, XGBoost, and Random Forest. The ActiveDelta approach is also able to identify more chemically diverse inhibitors in terms of their Murcko scaffolds. Finally, deep models such as Chemprop trained on data selected through ActiveDelta approaches can more accurately identify inhibitors in test data created through simulated time-splits. Overall, this study highlights the large potential for molecular pairing approaches to further improve popular active learning strategies in low data regimes by enabling faster and more accurate identification of more diverse molecular hits against critical drug targets.
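The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the study applies the idea to Chemprop and XGBoost, whereas the learner here is a hypothetical k-nearest-neighbor stand-in chosen so the example runs on the standard library alone. Training on all ordered pairs of training compounds, the toy one-dimensional descriptors, and the names `make_pairs`, `predict_delta`, and `select_next` are assumptions made for illustration.

```python
from itertools import product
from math import dist


def make_pairs(X, y):
    """Pair every training compound with every training compound.

    Each pair's features are the two descriptor vectors concatenated;
    its label is the potency difference y_j - y_i.
    """
    feats, deltas = [], []
    for (xi, yi), (xj, yj) in product(list(zip(X, y)), repeat=2):
        feats.append(tuple(xi) + tuple(xj))
        deltas.append(yj - yi)
    return feats, deltas


def predict_delta(pair_feats, pair_deltas, query, k=3):
    """Stand-in learner (assumption): mean delta of the k nearest training pairs."""
    ranked = sorted(zip(pair_feats, pair_deltas), key=lambda p: dist(p[0], query))
    nearest = ranked[:k]
    return sum(d for _, d in nearest) / len(nearest)


def select_next(train_X, train_y, pool_X, k=3):
    """Return the pool index with the largest predicted improvement
    over the current best (most potent) training compound."""
    pair_feats, pair_deltas = make_pairs(train_X, train_y)
    best_idx = max(range(len(train_y)), key=train_y.__getitem__)
    best = tuple(train_X[best_idx])
    scores = [
        predict_delta(pair_feats, pair_deltas, best + tuple(candidate), k)
        for candidate in pool_X
    ]
    return max(range(len(pool_X)), key=scores.__getitem__)


# Toy data (hypothetical): one descriptor per compound, potency equal to it.
train_X = [(0.0,), (1.0,), (2.0,)]
train_y = [0.0, 1.0, 2.0]
pool_X = [(0.5,), (3.0,), (1.5,)]

chosen = select_next(train_X, train_y, pool_X)
print("next compound to test:", pool_X[chosen])
```

In an exploitative active learning loop, the chosen compound would be measured, moved from the pool into the training set, and the pairing and selection repeated. A standard (unpaired) implementation would instead predict each candidate's absolute potency and pick the highest.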

Published In

Beilstein journal of organic chemistry

DOI

10.3762/bjoc.20.185

EISSN

1860-5397

ISSN

1860-5397

Publication Date

January 2024

Volume

20

Start / End Page

2152 / 2162

Related Subject Headings

  • Organic Chemistry
  • 3405 Organic chemistry
  • 0305 Organic Chemistry
 

Citation

APA
Fralish, Z., & Reker, D. (2024). Finding the most potent compounds using active learning on molecular pairs. Beilstein Journal of Organic Chemistry, 20, 2152–2162. https://doi.org/10.3762/bjoc.20.185

Chicago
Fralish, Zachary, and Daniel Reker. “Finding the most potent compounds using active learning on molecular pairs.” Beilstein Journal of Organic Chemistry 20 (January 2024): 2152–62. https://doi.org/10.3762/bjoc.20.185.

ICMJE
Fralish Z, Reker D. Finding the most potent compounds using active learning on molecular pairs. Beilstein journal of organic chemistry. 2024 Jan;20:2152–62.

MLA
Fralish, Zachary, and Daniel Reker. “Finding the most potent compounds using active learning on molecular pairs.” Beilstein Journal of Organic Chemistry, vol. 20, Jan. 2024, pp. 2152–62. Epmc, doi:10.3762/bjoc.20.185.

NLM
Fralish Z, Reker D. Finding the most potent compounds using active learning on molecular pairs. Beilstein journal of organic chemistry. 2024 Jan;20:2152–2162.
