Scholars@Duke publication: Combining synthetic data with subsampling to create public use microdata files for large scale surveys

Publication, Journal Article

Drechsler, J; Reiter, JP

Published in: Survey Methodology

June 1, 2012

To create public use files from large scale surveys, statistical agencies sometimes release random subsamples of the original records. Random subsampling reduces file sizes for secondary data analysts and reduces risks of unintended disclosures of survey participants' confidential information. However, subsampling does not eliminate risks, so that alteration of the data is needed before dissemination. We propose to create disclosure-protected subsamples from large scale surveys based on multiple imputation. The idea is to replace identifying or sensitive values in the original sample with draws from statistical models, and release subsamples of the disclosure-protected data. We present methods for making inferences with the multiple synthetic subsamples.
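The workflow described in the abstract (replace sensitive values with model draws, then release subsamples, repeated several times) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the function name make_synthetic_subsamples, the linear synthesis model, the choice to draw a fresh subsample for each release, and the parameters m and subsample_size are all hypothetical. The sketch also plugs in point estimates of the model parameters rather than drawing them from a posterior distribution, and it shows only the averaging of point estimates across releases; the variance estimators presented in the paper are not reproduced here.

```python
import numpy as np


def make_synthetic_subsamples(x, y, m=5, subsample_size=200, seed=0):
    """Return m disclosure-protected subsamples as (x, y_synthetic) pairs.

    Hypothetical helper: y plays the role of the sensitive variable and
    x the role of a non-sensitive variable that is released unchanged.
    """
    rng = np.random.default_rng(seed)
    n = len(y)

    # Fit a simple linear model y ~ x on the full original sample.
    # (Assumption: any suitable synthesis model could be used instead.)
    design = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma = np.sqrt(resid @ resid / (n - 2))

    releases = []
    for _ in range(m):
        # Replace every sensitive value with a draw from the fitted model.
        y_syn = design @ coef + rng.normal(0.0, sigma, size=n)
        # Release only a simple random subsample of the protected records.
        keep = rng.choice(n, size=subsample_size, replace=False)
        releases.append((x[keep], y_syn[keep]))
    return releases


# Toy data standing in for a large survey file.
rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = 2.0 + 3.0 * x + rng.normal(size=1000)

releases = make_synthetic_subsamples(x, y, m=5, subsample_size=200)

# Point estimate of the mean of y: average the m per-release estimates.
q_bar = float(np.mean([y_syn.mean() for _, y_syn in releases]))
```

An analyst receiving the m released files would compute the estimate of interest on each file and average the results, as in the last line; valid interval estimates would additionally need the combining rules developed in the paper.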

Duke Scholars

Author: Jerome P. Reiter, Statistical Science

Published In

Survey Methodology

EISSN

1492-0921

ISSN

0714-0045

Publication Date

June 1, 2012

Volume

38

Issue

1

Start / End Page

73 / 79

Related Subject Headings

Statistics & Probability
4905 Statistics
0104 Statistics

Citation

APA

Drechsler, J., & Reiter, J. P. (2012). Combining synthetic data with subsampling to create public use microdata files for large scale surveys. Survey Methodology, 38(1), 73–79.

Chicago

Drechsler, J., and J. P. Reiter. “Combining synthetic data with subsampling to create public use microdata files for large scale surveys.” Survey Methodology 38, no. 1 (June 1, 2012): 73–79.

ICMJE

Drechsler J, Reiter JP. Combining synthetic data with subsampling to create public use microdata files for large scale surveys. Survey Methodology. 2012 Jun 1;38(1):73–9.

MLA

Drechsler, J., and J. P. Reiter. “Combining synthetic data with subsampling to create public use microdata files for large scale surveys.” Survey Methodology, vol. 38, no. 1, June 2012, pp. 73–79.

NLM

Drechsler J, Reiter JP. Combining synthetic data with subsampling to create public use microdata files for large scale surveys. Survey Methodology. 2012 Jun 1;38(1):73–79.
