Building a drug development database: challenges in reliable data availability.

Journal Article


Policy and legislative efforts to improve the biomedical innovation process must rely on a detailed and thorough analysis of drug development and industry output.


As part of our efforts to build a publicly available database on the characteristics of drug development, we present work undertaken to test methods for compiling data from public sources. These initial steps are designed to explore challenges in data extraction, completeness and reliability. Specifically, filing dates for Investigational New Drug (IND) applications with the U.S. Food and Drug Administration (FDA) were chosen as the initial objective data element to be collected.

Materials and methods

FDA's Drugs@FDA database and the Federal Register (FR) were used to collect IND dates for the 587 New Molecular Entities (NMEs) approved between 1994 and 2014. When available, the following data were captured: approval date, IND number, IND date and source of information.


At least one IND date was available for 445 (75.8%) of the 587 NMEs. The Drugs@FDA database provided IND dates for 303 (51.6%) NMEs and the FR provided IND dates for 297 (50.6%) NMEs. Of the 445 NMEs for which an IND date was obtained, 274 (61.6%) had more than one date reported.
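The reported percentages can be checked with a short calculation. The sketch below uses only the counts given above; the variable names are illustrative, and the overlap figure is a derived estimate that assumes the two per-source counts refer to distinct NMEs and that every NME with an IND date was found in Drugs@FDA, the FR, or both.

```python
# Counts as reported in the results; names are illustrative, not from the paper.
TOTAL_NMES = 587             # NMEs approved between 1994 and 2014
with_any_ind_date = 445      # NMEs with at least one IND date
from_drugs_at_fda = 303      # NMEs with an IND date in Drugs@FDA
from_federal_register = 297  # NMEs with an IND date in the Federal Register
with_multiple_dates = 274    # NMEs with more than one IND date reported


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


coverage = {
    "any source": pct(with_any_ind_date, TOTAL_NMES),
    "Drugs@FDA": pct(from_drugs_at_fda, TOTAL_NMES),
    "Federal Register": pct(from_federal_register, TOTAL_NMES),
    "multiple dates (of those with a date)": pct(with_multiple_dates, with_any_ind_date),
}

# Assumption: the 445 NMEs are the union of the two sources, so
# inclusion-exclusion gives the number covered by both.
in_both_sources = from_drugs_at_fda + from_federal_register - with_any_ind_date

# NMEs for which neither source yielded an IND date.
no_ind_date = TOTAL_NMES - with_any_ind_date

if __name__ == "__main__":
    for label, value in coverage.items():
        print(f"{label}: {value}%")
    print(f"NMEs in both sources (derived): {in_both_sources}")
    print(f"NMEs with no IND date: {no_ind_date}")
```

Under those assumptions, the two sources overlap for only part of the covered NMEs, which is consistent with the need to consult both.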


The key finding of this paper is considerable inconsistency in the availability and reporting of data elements, in this case IND application filing dates, as assembled from publicly available sources.


Our team will continue to focus on finding ways to collect relevant information to measure the impact of drug innovation.

Cited Authors

  • Audibert, C; Romine, M; Caze, A; Daniel, G; Leff, J; McClellan, M

Published Date

  • January 2017

Published In

Volume / Issue

  • 43 / 1

Start / End Page

  • 74 - 78

PubMed ID

  • 27494335

Electronic International Standard Serial Number (EISSN)

  • 1520-5762

International Standard Serial Number (ISSN)

  • 0363-9045

Digital Object Identifier (DOI)

  • 10.1080/03639045.2016.1220565


Language

  • eng