Drug discovery using very large numbers of patents: general strategy with extensive use of match and edit operations.

Publication, Journal Article
Robson, B; Li, J; Dettinger, R; Peters, A; Boyer, SK
Published in: Journal of computer-aided molecular design
May 2011

A patent data base of 6.7 million compounds generated by a very high performance computer (Blue Gene) requires new techniques for exploitation when extensive use of chemical similarity is involved. Such exploitation includes the taxonomic classification of chemical themes, and data mining to assess mutual information between themes and companies. Importantly, we also launch candidates that evolve by "natural selection" as failure of partial match against the patent data base and their ability to bind to the protein target appropriately, by simulation on Blue Gene. An unusual feature of our method is that algorithms and workflows rely on dynamic interaction between match-and-edit instructions, which in practice are regular expressions. Similarity testing by these uses SMILES strings and, less frequently, graph or connectivity representations. Examining how this performs in high throughput, we note that chemical similarity and novelty are human concepts that largely have meaning by utility in specific contexts. For some purposes, mutual information involving chemical themes might be a better concept.
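The abstract describes match-and-edit instructions that are, in practice, regular expressions applied to SMILES strings. The following is a minimal sketch of that idea, not the authors' implementation: the three-compound "patent" list, the acid theme, the acid-to-amide rule, and the function names are all hypothetical and chosen only for illustration. It assumes a carboxylic acid is written as `C(=O)O` at the end of the string, and it treats novelty as exact string mismatch, which is crude because one molecule can be written as many different SMILES strings.

```python
import re

# Hypothetical toy "patent" set (assumption: not data from the paper).
PATENT_SMILES = [
    "CC(=O)Oc1ccccc1C(=O)O",       # aspirin
    "CC(=O)Nc1ccc(O)cc1",          # paracetamol
    "CC(C)Cc1ccc(cc1)C(C)C(=O)O",  # ibuprofen
]

# A chemical "theme" as a regular expression: a terminal carboxylic acid,
# assuming it is written as C(=O)O at the end of the SMILES string.
ACID_THEME = re.compile(r"C\(=O\)O$")

# A match-and-edit rule as a (pattern, replacement) pair:
# rewrite the terminal acid as a primary amide.
ACID_TO_AMIDE = (re.compile(r"C\(=O\)O$"), "C(=O)N")


def matches_theme(smiles, theme):
    """Return True if the theme pattern occurs in the SMILES string."""
    return theme.search(smiles) is not None


def apply_edit(smiles, rule):
    """Apply one match-and-edit rule; the string is unchanged if no match."""
    pattern, replacement = rule
    return pattern.sub(replacement, smiles, count=1)


def is_novel(candidate, patents):
    """Crude novelty test: the exact string is absent from the patent set."""
    return candidate not in patents


# Classify the patent set by theme, then generate edited candidates.
themed = [s for s in PATENT_SMILES if matches_theme(s, ACID_THEME)]
candidates = [apply_edit(s, ACID_TO_AMIDE) for s in PATENT_SMILES]
novel = [c for c in candidates if is_novel(c, PATENT_SMILES)]
```

Text-level edits of this kind do not guarantee chemically valid output, so a real workflow would need to validate or canonicalize each candidate with a cheminformatics toolkit before any similarity or binding test.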

Published In

Journal of computer-aided molecular design

DOI

10.1007/s10822-011-9429-x

EISSN

1573-4951

ISSN

0920-654X

Publication Date

May 2011

Volume

25

Issue

5

Start / End Page

427 / 441

Related Subject Headings

  • Small Molecule Libraries
  • Pattern Recognition, Automated
  • Patents as Topic
  • Medicinal & Biomolecular Chemistry
  • Information Storage and Retrieval
  • Image Interpretation, Computer-Assisted
  • Humans
  • Drug Discovery
  • Databases, Factual
  • Data Interpretation, Statistical
 

Citation

Robson, B., Li, J., Dettinger, R., Peters, A., & Boyer, S. K. (2011). Drug discovery using very large numbers of patents: general strategy with extensive use of match and edit operations. Journal of Computer-Aided Molecular Design, 25(5), 427–441. https://doi.org/10.1007/s10822-011-9429-x