An evaluation of factors influencing Bayesian learning systems.
OBJECTIVES: To examine the influences of situational and model factors on the accuracy of Bayesian learning systems.
DESIGN: This study examines the impacts of variations in two situational factors, training sample size and number of attributes, and in two model factors, choice of Bayesian model and criteria for excluding model attributes, on the overall accuracy of Bayesian learning systems.
MEASUREMENTS: The test data were derived from myocardial infarction patients who were admitted to eight hospitals in New Orleans during 1985. The test sample consisted of 339 cases; the training samples included 100, 400, and 800 cases. APACHE II variables were used for the model attributes and patient discharge status as the outcome predicted. Attribute sets were selected in sizes of 4, 8, and 12. The authors varied the Bayesian models (proper and simple) and the attribute exclusion criteria (optimism and pessimism).
RESULTS: The simple Bayes model, which assumes conditional independence, consistently equalled or outperformed the proper (maximally dependent) Bayes model, which assumes conditional dependence, across all training sample and attribute set sizes. Not excluding model attributes was found to be preferable to using sample theory as an attribute exclusion criterion in both the simple and the proper models.
CONCLUSION: In the domain tested, the simple Bayes model with optimistic exclusion is more robust than previously assumed and increasing the number of attributes in a model had a greater relative impact on model accuracy than did increasing the number of training sample cases. Assessment of applicability of these findings to other domains will require further study. In addition, other models that are between these two extremes must be investigated. These include models that approximate proper Bayes' conditional dependence computations while requiring fewer training sample cases, attribute exclusion criteria between optimism and pessimism that improve accuracy, and ordering techniques for introducing attributes into Bayes models that optimize the information value associated with the attributes in test-sample cases.
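The contrast between the two models can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the restriction to discrete attributes, the `n_values` parameter, and the add-one smoothing are all assumptions made for illustration. The simple model multiplies per-attribute likelihoods (conditional independence), while the proper model estimates the likelihood of the whole attribute combination at once (maximal dependence), which is why it needs far more training cases per combination.

```python
from collections import Counter, defaultdict


def train_simple_bayes(rows, labels):
    """Count class frequencies and, within each class, per-attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(label, i)][value] += 1
    return class_counts, value_counts


def simple_bayes_posterior(model, row, n_values=2):
    """Posterior over classes, treating attributes as conditionally independent."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, value in enumerate(row):
            # Add-one smoothing is an assumption of this sketch, not from the study.
            score *= (value_counts[(label, i)][value] + 1) / (count + n_values)
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


def train_proper_bayes(rows, labels):
    """Count class frequencies and the frequency of each full attribute combination."""
    class_counts = Counter(labels)
    joint_counts = Counter((label, tuple(row)) for row, label in zip(rows, labels))
    return class_counts, joint_counts


def proper_bayes_posterior(model, row, n_values=2):
    """Posterior over classes, using the joint likelihood of all attributes together."""
    class_counts, joint_counts = model
    total = sum(class_counts.values())
    n_cells = n_values ** len(row)  # number of possible attribute combinations
    scores = {}
    for label, count in class_counts.items():
        likelihood = (joint_counts[(label, tuple(row))] + 1) / (count + n_cells)
        scores[label] = (count / total) * likelihood
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}
```

With 12 binary attributes the proper model in this sketch would have 4096 combinations per outcome class to estimate, many of them unseen in a training sample of 100 to 800 cases, which is one plausible reading of why the simple model held up so well in the study.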