Scholars@Duke publication: Selection of Informative Examples in Chemogenomic Datasets.

Selection of Informative Examples in Chemogenomic Datasets.

Publication, Chapter

Reker, D; Brown, JB

January 2018

High-throughput and high-content screening campaigns have resulted in the creation of large chemogenomic matrices. These matrices form the training data used to build ligand-target interaction models for pharmacological and chemical biology research. While academic, government, and industrial efforts continuously add to the ligand-target data pairs available for modeling, major research efforts are devoted to improving machine learning techniques to cope with the sparseness, heterogeneity, and size of available datasets as well as their inherent noise and bias. This "race of arms" has led to algorithms that generate highly complex models with high prediction performance at the cost of training efficiency and interpretability. In contrast, recent studies have challenged the necessity for "big data" in chemogenomic modeling and found that models built on larger numbers of examples do not necessarily have better predictive ability. Automated adaptive selection of the training data (ligand-target instances) used for model creation can yield considerably smaller training sets that retain prediction performance on par with training on hundreds of thousands of data points. In this chapter, we describe the protocols used for one such iterative chemogenomic selection technique, including model construction and update, as well as possible techniques for evaluating the constructed models and analyzing the iterative model construction.
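To make the idea of iterative, adaptive selection of training instances concrete, here is a minimal sketch of such a loop. It is not the chapter's protocol: the abstract does not state the model or the selection criterion, so the sketch assumes a toy nearest-centroid classifier and picks the pool instance the current model is least certain about (uncertainty sampling). The two-dimensional synthetic "ligand-target descriptors", the function names, and all parameters are hypothetical.

```python
import random

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(features, labels):
    """Toy model (assumption): one centroid per class, 0 = inactive, 1 = active."""
    return {c: centroid([f for f, y in zip(features, labels) if y == c])
            for c in (0, 1)}

def predict(model, x):
    """Assign x to the class with the nearer centroid."""
    return 0 if sq_dist(x, model[0]) <= sq_dist(x, model[1]) else 1

def margin(model, x):
    """A small margin means the model is unsure about x."""
    return abs(sq_dist(x, model[0]) - sq_dist(x, model[1]))

def active_selection(pool_x, pool_y, seed_idx, n_iter):
    """Grow the training set by one instance per iteration.

    seed_idx must contain at least one example of each class.
    Returns the selected pool indices and the final model.
    """
    selected = list(seed_idx)
    for _ in range(n_iter):
        model = fit([pool_x[i] for i in selected], [pool_y[i] for i in selected])
        chosen = set(selected)
        remaining = [i for i in range(len(pool_x)) if i not in chosen]
        if not remaining:
            break
        # Query the instance with the smallest margin, then "measure" its label.
        selected.append(min(remaining, key=lambda i: margin(model, pool_x[i])))
    model = fit([pool_x[i] for i in selected], [pool_y[i] for i in selected])
    return selected, model

# Synthetic pool (hypothetical data): 100 inactive pairs near (0, 0),
# followed by 100 active pairs near (3, 3).
rng = random.Random(0)
pool_x = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
pool_x += [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(100)]
pool_y = [0] * 100 + [1] * 100

selected, model = active_selection(pool_x, pool_y, seed_idx=[0, 100], n_iter=10)
accuracy = sum(predict(model, x) == y for x, y in zip(pool_x, pool_y)) / len(pool_x)
print(f"trained on {len(selected)} of {len(pool_x)} instances, pool accuracy {accuracy:.2f}")
```

In a realistic setting the toy classifier would be replaced by whatever model the study uses, and performance would be tracked on held-out data after every iteration rather than on the pool itself.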

Duke Scholars

Author Daniel Reker Biomedical Engineering

DOI

10.1007/978-1-4939-8639-2_13

Publication Date

January 2018

Volume

1825

Start / End Page

369 / 410

Related Subject Headings

Pharmaceutical Preparations
Machine Learning
Humans
Drug Discovery
Developmental Biology
Datasets as Topic
Databases, Chemical
Data Mining
Computer Simulation
3404 Medicinal and biomolecular chemistry

Citation

APA: Reker, D., & Brown, J. B. (2018). Selection of Informative Examples in Chemogenomic Datasets. (Vol. 1825, pp. 369–410). https://doi.org/10.1007/978-1-4939-8639-2_13

Chicago: Reker, Daniel, and J. B. Brown. “Selection of Informative Examples in Chemogenomic Datasets,” 1825:369–410, 2018. https://doi.org/10.1007/978-1-4939-8639-2_13.

ICMJE: Reker D, Brown JB. Selection of Informative Examples in Chemogenomic Datasets. In 2018. p. 369–410.

MLA: Reker, Daniel, and J. B. Brown. Selection of Informative Examples in Chemogenomic Datasets. Vol. 1825, 2018, pp. 369–410. Epmc, doi:10.1007/978-1-4939-8639-2_13.

NLM: Reker D, Brown JB. Selection of Informative Examples in Chemogenomic Datasets. 2018. p. 369–410.
