Synthesizing categorical datasets to enhance inference

Publication, Journal Article
Berrocal, VJ; Miranda, ML; Gelfand, AE; Bhattacharya, S
Published in: Statistical Methodology
November 1, 2013

A common data analysis setting consists of a collection of datasets of varying sizes that are all relevant to a particular scientific question, but which include different subsets of the relevant variables, presumably with some overlap. Here, we demonstrate that synthesizing cross-classified categorical datasets drawn from an incompletely cross-classified common population, where many of the sets are incomplete (i.e., one or more of the classification variables is unobserved) but at least one is completely observed, is expected to reduce uncertainty about the cell probabilities in the associated multi-way contingency table as well as about derived quantities such as relative risks and odds ratios. The use of the word "expected" here is the key point. When synthesizing complete datasets from a common population, we are assured of reducing uncertainty. However, when we work with a log-linear model to explain the complete table, improvement is not assured, because this model cannot be fitted to any of the incomplete datasets. We provide technical clarification of this point as well as a series of simulation examples, motivated by an adverse birth outcomes investigation, to illustrate what can be expected under such synthesis. © 2013.

Published In

Statistical Methodology

DOI

10.1016/j.stamet.2013.04.001

ISSN

1572-3127

Publication Date

November 1, 2013

Volume

15

Start / End Page

25 / 45

Related Subject Headings

  • Statistics & Probability
  • 4905 Statistics
  • 0104 Statistics
 

Citation

APA: Berrocal, V. J., Miranda, M. L., Gelfand, A. E., & Bhattacharya, S. (2013). Synthesizing categorical datasets to enhance inference. Statistical Methodology, 15, 25–45. https://doi.org/10.1016/j.stamet.2013.04.001
Chicago: Berrocal, V. J., M. L. Miranda, A. E. Gelfand, and S. Bhattacharya. “Synthesizing categorical datasets to enhance inference.” Statistical Methodology 15 (November 1, 2013): 25–45. https://doi.org/10.1016/j.stamet.2013.04.001.
ICMJE: Berrocal VJ, Miranda ML, Gelfand AE, Bhattacharya S. Synthesizing categorical datasets to enhance inference. Statistical Methodology. 2013 Nov 1;15:25–45.
MLA: Berrocal, V. J., et al. “Synthesizing categorical datasets to enhance inference.” Statistical Methodology, vol. 15, Nov. 2013, pp. 25–45. Scopus, doi:10.1016/j.stamet.2013.04.001.
NLM: Berrocal VJ, Miranda ML, Gelfand AE, Bhattacharya S. Synthesizing categorical datasets to enhance inference. Statistical Methodology. 2013 Nov 1;15:25–45.