Scholars@Duke publication: Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance.

Publication, Journal Article

Mazurowski, MA; Habas, PA; Zurada, JM; Lo, JY; Baker, JA; Tourassi, GD

Published in: Neural Netw

2008

This study investigates the effect of class imbalance in training data when developing neural network classifiers for computer-aided medical diagnosis. The investigation is performed in the presence of other characteristics that are typical of medical data, namely a small training sample size, a large number of features, and correlations between features. Two methods of neural network training are explored: classical backpropagation (BP) and particle swarm optimization (PSO) with clinically relevant training criteria. An experimental study is performed using simulated data, and the conclusions are further validated on real clinical data for breast cancer diagnosis. The results show that classifier performance deteriorates with even modest class imbalance in the training data. Further, BP is shown to be generally preferable to PSO for imbalanced training data, especially with a small data sample and a large number of features. Finally, no clear preference is found between oversampling and applying no compensation, and some guidance is provided on choosing between the two.
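To make the oversampling idea concrete, here is a minimal sketch of random oversampling, in which minority-class cases are duplicated at random until every class is as large as the largest one. It is an illustration only and is not taken from the paper: the function name `oversample_minority`, the toy data, and the 90/10 class split are all assumptions, and the authors' exact resampling procedure may differ.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate cases of smaller classes until all classes match the largest.

    Hypothetical helper for illustration; not the procedure used in the paper.
    """
    rng = random.Random(seed)

    # Group the feature vectors by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(rows) for rows in by_class.values())

    balanced = []
    for y, rows in by_class.items():
        # Draw the missing cases with replacement from the existing ones.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        balanced.extend((x, y) for x in rows + extra)

    # Shuffle so that duplicated cases are not grouped together.
    rng.shuffle(balanced)
    return [x for x, _ in balanced], [y for _, y in balanced]


# Toy data (assumed): 90 negative and 10 positive cases with 5 features each.
data_rng = random.Random(42)
samples = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(100)]
labels = [0] * 90 + [1] * 10

bal_samples, bal_labels = oversample_minority(samples, labels)
print(sorted(Counter(bal_labels).items()))  # [(0, 90), (1, 90)]
```

Oversampling of this kind should be applied to the training split only. Duplicating cases before the train/test split would place copies of the same case on both sides and inflate the measured performance.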

Duke Scholars

Author Maciej A Mazurowski Biostatistics & Bioinformatics, Division of Translational Bi ...

Author Joseph Yuan-Chieh Lo Radiology

Author Jay Alan Baker Radiology, Breast Imaging

Published In

Neural Netw

DOI

10.1016/j.neunet.2007.12.031

ISSN

0893-6080

Publication Date

2008

Volume

21

Issue

2-3

Start / End Page

427 / 436

Location

United States

Related Subject Headings

ROC Curve
Neural Networks, Computer
Humans
Feedback
Electronic Data Processing
Diagnosis, Computer-Assisted
Decision Making
Computer Simulation
Breast Neoplasms
Artificial Intelligence & Image Processing

Citation

Mazurowski, M. A., Habas, P. A., Zurada, J. M., Lo, J. Y., Baker, J. A., & Tourassi, G. D. (2008). Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance. Neural Netw, 21(2–3), 427–436. https://doi.org/10.1016/j.neunet.2007.12.031

Mazurowski, Maciej A., Piotr A. Habas, Jacek M. Zurada, Joseph Y. Lo, Jay A. Baker, and Georgia D. Tourassi. “Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance.” Neural Netw 21, no. 2–3 (2008): 427–36. https://doi.org/10.1016/j.neunet.2007.12.031.

Mazurowski MA, Habas PA, Zurada JM, Lo JY, Baker JA, Tourassi GD. Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance. Neural Netw. 2008;21(2–3):427–36.

Mazurowski, Maciej A., et al. “Training neural network classifiers for medical decision making: the effects of imbalanced datasets on classification performance.” Neural Netw, vol. 21, no. 2–3, 2008, pp. 427–36. Pubmed, doi:10.1016/j.neunet.2007.12.031.
