Data mining methods find demographic predictors of preterm birth.

Journal Article

BACKGROUND: Preterm births in the United States increased from 11.0% to 11.4% between 1996 and 1997, and they remain a complex healthcare problem.

OBJECTIVE: To compare traditional statistical methods with newer methods, known as data mining or knowledge discovery in databases, in identifying accurate predictors of preterm birth.

METHOD: An ethnically diverse sample (N = 19,970) of pregnant women provided data (1,622 variables) for analysis. Preterm birth predictors were evaluated using both traditional statistical and newer data mining analyses.

RESULTS: Seven demographic variables (maternal age and binary coding for county of residence, education, marital status, payer source, race, and religion) yielded an area under the Receiver Operating Characteristic curve of .72, the measure used to test predictive accuracy. Adding hundreds of other variables increased the area under the curve by only .03.

CONCLUSION: Similar results across data mining methods suggest that the results are data-driven rather than method-dependent, and that demographic variables offer a small, parsimonious set of predictors with reasonable accuracy for preterm birth outcomes in a racially diverse population.
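The area under the curve reported in the abstract can be read as the probability that a randomly chosen preterm case receives a higher predicted risk than a randomly chosen term case. The sketch below illustrates that definition only; it is not the study's analysis. The function name `roc_auc` and the toy labels and scores are hypothetical and invented for illustration, not drawn from the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = preterm, 0 = term; scores are made-up risk estimates.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this scale, .5 corresponds to chance-level ranking and 1.0 to perfect separation of the two outcome groups. The pairwise loop is quadratic in sample size, so a rank-based formulation would be preferable for a data set of the size described above.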

Cited Authors

  • Goodwin, LK; Iannacchione, MA; Hammond, WE; Crockett, P; Maher, S; Schlitz, K

Published Date

  • November 2001

Published In

Volume / Issue

  • 50 / 6

Start / End Page

  • 340 - 345

PubMed ID

  • 11725935

International Standard Serial Number (ISSN)

  • 0029-6562

Digital Object Identifier (DOI)

  • 10.1097/00006199-200111000-00003


Language

  • eng

Conference Location

  • United States