Classification via bayesian nonparametric learning of affine subspaces
It has become common for datasets to contain large numbers of variables in studies conducted in areas such as genetics, machine vision, image analysis, and many others. When analyzing such data, parametric models are often too inflexible while nonparametric procedures tend to be nonrobust because of insufficient data on these high-dimensional spaces. This is particularly true when interest lies in building efficient classifiers in the presence of many predictor variables. When dealing with these types of data, it is often the case that most of the variability tends to lie along a few directions, or more generally along a much smaller dimensional submanifold of the data space. In this article, we propose a class of models that flexibly learn about this submanifold while simultaneously performing dimension reduction in classification. This methodology allows the cell probabilities to vary nonparametrically based on a few coordinates expressed as linear combinations of the predictors. Also, as opposed to many black-box methods for dimensionality reduction, the proposed model is appealing in having clearly interpretable and identifiable parameters that provide insight into which predictors are important in determining accurate classification boundaries. Gibbs sampling methods are developed for posterior computation, and the methods are illustrated using simulated and real data applications. © 2013 American Statistical Association.
Page, G; Bhattacharya, A; Dunson, D
Volume / Issue
Start / End Page
Electronic International Standard Serial Number (EISSN)
International Standard Serial Number (ISSN)
Digital Object Identifier (DOI)