Bayesian multiple imputation for large-scale categorical data with structural zeros


Journal Article

© Minister of Industry, 2014.

We propose an approach for multiple imputation of items missing at random in large-scale surveys with exclusively categorical variables that have structural zeros. Our approach is to use mixtures of multinomial distributions as imputation engines, accounting for structural zeros by conceiving of the observed data as a truncated sample from a hypothetical population without structural zeros. This approach has several appealing features: imputations are generated from coherent, Bayesian joint models that automatically capture complex dependencies and readily scale to large numbers of variables. We outline a Gibbs sampling algorithm for implementing the approach, and we illustrate its potential with a repeated sampling study using public use census microdata from the state of New York, U.S.A.
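
The truncation idea in the abstract can be illustrated with a small sketch. It is not the authors' Gibbs sampler and fits no model. It only draws records from a fixed mixture of independent multinomials and discards any draw that lands on a structural-zero cell, so the kept records are a truncated sample from the unrestricted population. The function name, the toy variables, the probabilities, and the "married child" rule are all hypothetical and chosen only for illustration.

```python
import random


def sample_truncated_mixture(n, class_weights, item_probs, is_structural_zero, rng):
    """Draw n records from a mixture of product-multinomials,
    rejecting every draw that falls on a structural-zero cell.

    class_weights[k]   : probability of mixture component k
    item_probs[k][j]   : category probabilities of variable j in component k
    is_structural_zero : predicate marking impossible combinations
    """
    kept = []
    n_rejected = 0
    while len(kept) < n:
        # Pick a mixture component.
        k = rng.choices(range(len(class_weights)), weights=class_weights)[0]
        # Within a component the variables are independent multinomials.
        record = tuple(
            rng.choices(range(len(p)), weights=p)[0] for p in item_probs[k]
        )
        if is_structural_zero(record):
            n_rejected += 1  # part of the hypothetical population, never observed
        else:
            kept.append(record)
    return kept, n_rejected


# Hypothetical toy setup (assumed, not from the paper):
# variable 0 = age group (0 = child, 1 = adult)
# variable 1 = marital status (0 = single, 1 = married)
class_weights = [0.6, 0.4]
item_probs = [
    [[0.7, 0.3], [0.8, 0.2]],  # component 0
    [[0.1, 0.9], [0.3, 0.7]],  # component 1
]


def child_married(record):
    # Assumed structural zero: a child cannot be married.
    return record == (0, 1)


rng = random.Random(0)
records, n_rejected = sample_truncated_mixture(
    1000, class_weights, item_probs, child_married, rng
)
```

In this sketch `records` never contains the impossible cell, while `n_rejected` counts the draws from the hypothetical unrestricted population that were discarded. A full implementation would additionally place priors on the mixture parameters and update them within a Gibbs sampler, as the abstract describes.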

Cited Authors

  • Manrique-Vallier, D; Reiter, JP

Published Date

  • January 1, 2014

Published In

Volume / Issue

  • 40 / 1

Start / End Page

  • 125 - 134

Electronic International Standard Serial Number (EISSN)

  • 1492-0921

International Standard Serial Number (ISSN)

  • 0714-0045

Citation Source

  • Scopus