Nonparametric Bayesian Multiple Imputation for Incomplete Categorical Variables in Large-Scale Assessment Surveys
In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian, joint modeling approach to multiple imputation for categorical data based on Dirichlet process mixtures of multinomial distributions. The approach automatically models complex dependencies while being computationally expedient. The Dirichlet process prior distributions enable analysts to avoid fixing the number of mixture components at an arbitrary number. We illustrate repeated sampling properties of the approach using simulated data. We apply the methodology to impute missing background data in the 2007 Trends in International Mathematics and Science Study. © 2013 AERA.
Duke Scholars
Altmetric Attention Stats
Dimensions Citation Stats
Published In
DOI
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Social Sciences Methods
- 4905 Statistics
- 3904 Specialist studies in education
- 1701 Psychology
- 1303 Specialist Studies in Education
- 0104 Statistics
Citation
Published In
DOI
ISSN
Publication Date
Volume
Issue
Start / End Page
Related Subject Headings
- Social Sciences Methods
- 4905 Statistics
- 3904 Specialist studies in education
- 1701 Psychology
- 1303 Specialist Studies in Education
- 0104 Statistics