
Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality

Publication, Journal Article
Reiter, JP; Drechsler, J
Published in: Statistica Sinica
January 1, 2010

To protect the confidentiality of survey respondents' identities and sensitive attributes, statistical agencies can release data in which confidential values are replaced with multiple imputations. These are called synthetic data. We propose a two-stage approach to generating synthetic data that enables agencies to release different numbers of imputations for different variables. Generation in two stages can reduce computational burdens, decrease disclosure risk, and increase inferential accuracy relative to generation in one stage. We present methods for obtaining inferences from such data. We describe the application of two-stage synthesis to creating a public use file for a German business database.
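As a rough illustration of the nesting the abstract describes, the sketch below generates m first-stage replacements of one variable and, for each of them, r second-stage replacements of another, so the two variables end up with different numbers of imputations. It is a minimal sketch under stated assumptions, not the authors' procedure: the function name two_stage_synthesis, the variables x and y, and the toy normal and linear models are all hypothetical, and the model parameters are fixed at their estimates rather than drawn from a posterior distribution as a proper multiple-imputation procedure would.

```python
import random
import statistics


def two_stage_synthesis(records, m, r, seed=0):
    """Return m nests, each holding r synthetic copies of `records`.

    records: list of (x, y) pairs. Hypothetical setup: x is replaced in
    stage 1 (m draws) and y in stage 2 (r draws per stage-1 draw).
    """
    rng = random.Random(seed)
    xs = [x for x, _ in records]
    ys = [y for _, y in records]

    # Toy stage-1 model (an assumption): x is normal with the sample moments.
    x_mean, x_sd = statistics.mean(xs), statistics.stdev(xs)

    # Toy stage-2 model (an assumption): y given x is normal around a
    # least-squares line fitted to the original records.
    y_mean = statistics.mean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in records)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    resid_sd = statistics.stdev(
        [y - (intercept + slope * x) for x, y in records]
    )

    nests = []
    for _ in range(m):
        # Stage 1: one synthetic version of x, shared by the whole nest.
        x_syn = [rng.gauss(x_mean, x_sd) for _ in records]
        nest = []
        for _ in range(r):
            # Stage 2: a fresh synthetic y for the same stage-1 x values.
            y_syn = [rng.gauss(intercept + slope * x, resid_sd) for x in x_syn]
            nest.append(list(zip(x_syn, y_syn)))
        nests.append(nest)
    return nests


# Made-up example records, for illustration only.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
nests = two_stage_synthesis(data, m=3, r=4)
print(len(nests), len(nests[0]))  # 3 nests of 4 datasets each
```

In this layout the released file would contain m × r datasets, but only m distinct versions of x, which is one way to read "different numbers of imputations for different variables". The inference methods the abstract mentions are not reproduced here.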

Published In

Statistica Sinica

ISSN

1017-0405

Publication Date

January 1, 2010

Volume

20

Issue

1

Start / End Page

405 / 421

Related Subject Headings

  • Statistics & Probability
  • 4905 Statistics
  • 0801 Artificial Intelligence and Image Processing
  • 0199 Other Mathematical Sciences
  • 0104 Statistics
 

Citation

APA: Reiter, J. P., & Drechsler, J. (2010). Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality. Statistica Sinica, 20(1), 405–421.
Chicago: Reiter, J. P., and J. Drechsler. “Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality.” Statistica Sinica 20, no. 1 (January 1, 2010): 405–21.
ICMJE: Reiter JP, Drechsler J. Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality. Statistica Sinica. 2010 Jan 1;20(1):405–21.
MLA: Reiter, J. P., and J. Drechsler. “Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality.” Statistica Sinica, vol. 20, no. 1, Jan. 2010, pp. 405–21.
NLM: Reiter JP, Drechsler J. Releasing multiply-imputed synthetic data generated in two stages to protect confidentiality. Statistica Sinica. 2010 Jan 1;20(1):405–421.
