Selection of Informative Examples in Chemogenomic Datasets.

Publication, Chapter
Reker, D; Brown, JB
January 2018

High-throughput and high-content screening campaigns have resulted in the creation of large chemogenomic matrices. These matrices form the training data used to build ligand-target interaction models for pharmacological and chemical biology research. While academic, government, and industrial efforts continuously add to the ligand-target data pairs available for modeling, major research efforts are devoted to improving machine learning techniques to cope with the sparseness, heterogeneity, and size of the available datasets, as well as their inherent noise and bias. This "arms race" has led to algorithms that generate highly complex models with high prediction performance, at the cost of training efficiency and interpretability.

In contrast, recent studies have challenged the necessity for "big data" in chemogenomic modeling and found that models built on larger numbers of examples do not necessarily have better predictive ability. Automated adaptive selection of the training data (ligand-target instances) used for model creation can yield considerably smaller training sets that retain prediction performance on par with training on hundreds of thousands of data points. In this chapter, we describe the protocols used for one such iterative chemogenomic selection technique, including model construction and update, as well as possible techniques for evaluating the constructed models and analyzing the iterative model construction.
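
The iterative selection idea described in the abstract can be illustrated with a generic active-learning loop. The sketch below is not the chapter's protocol. It assumes a toy k-nearest-neighbor vote as the model, uncertainty sampling as the selection rule, and synthetic two-dimensional descriptors with a made-up labeling function (`oracle`) standing in for an experimental measurement. All function and variable names are hypothetical.

```python
import random


def knn_active_fraction(train, query, k=5):
    """Fraction of the k nearest labeled examples that are active (label 1)."""
    nearest = sorted(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], query)),
    )[:k]
    return sum(label for _, label in nearest) / len(nearest)


def active_learning(pool, oracle, n_initial=4, n_rounds=20, k=5, seed=0):
    """Grow a training set by repeatedly labeling the most uncertain example."""
    rng = random.Random(seed)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    # Seed the model with a few randomly chosen labeled examples.
    train = [(x, oracle(x)) for x in unlabeled[:n_initial]]
    unlabeled = unlabeled[n_initial:]
    for _ in range(n_rounds):
        if not unlabeled:
            break
        # Uncertainty sampling: pick the example whose predicted
        # active fraction is closest to 0.5.
        pick = min(
            unlabeled,
            key=lambda x: abs(knn_active_fraction(train, x, k) - 0.5),
        )
        unlabeled.remove(pick)
        # "Run the experiment" for the picked example; the k-NN model is
        # updated implicitly because it reads the training set directly.
        train.append((pick, oracle(pick)))
    return train


# Synthetic stand-in for ligand-target descriptors (assumption, not real data).
data_rng = random.Random(1)
pool = [(data_rng.random(), data_rng.random()) for _ in range(200)]
oracle = lambda x: 1 if x[0] + x[1] > 1.0 else 0  # made-up activity rule

selected = active_learning(pool, oracle, n_initial=4, n_rounds=20)
print(len(selected), "of", len(pool), "examples labeled")
```

In this toy setting the loop labels only a small fraction of the pool, and the picks tend to fall where the current model is least certain. Whether such a reduced set matches the performance of the full dataset has to be checked separately, for example on held-out data.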

DOI

10.1007/978-1-4939-8639-2_13

Publication Date

January 2018

Volume

1825

Start / End Page

369 / 410

Related Subject Headings

  • Pharmaceutical Preparations
  • Machine Learning
  • Humans
  • Drug Discovery
  • Developmental Biology
  • Datasets as Topic
  • Databases, Chemical
  • Data Mining
  • Computer Simulation
  • 3404 Medicinal and biomolecular chemistry
 

Citation

APA: Reker, D., & Brown, J. B. (2018). Selection of Informative Examples in Chemogenomic Datasets. (Vol. 1825, pp. 369–410). https://doi.org/10.1007/978-1-4939-8639-2_13
Chicago: Reker, Daniel, and J. B. Brown. “Selection of Informative Examples in Chemogenomic Datasets,” 1825:369–410, 2018. https://doi.org/10.1007/978-1-4939-8639-2_13.
ICMJE: Reker D, Brown JB. Selection of Informative Examples in Chemogenomic Datasets. In 2018. p. 369–410.
MLA: Reker, Daniel, and J. B. Brown. Selection of Informative Examples in Chemogenomic Datasets. Vol. 1825, 2018, pp. 369–410. Epmc, doi:10.1007/978-1-4939-8639-2_13.
