Categorical data fusion using auxiliary information

Publication, Journal Article
Fosdick, BK; Deyoreo, M; Reiter, JP
Published in: Annals of Applied Statistics
December 1, 2016

In data fusion, analysts seek to combine information from two databases comprised of disjoint sets of individuals, in which some variables appear in both databases and other variables appear in only one database. Most data fusion techniques rely on variants of conditional independence assumptions. When inappropriate, these assumptions can result in unreliable inferences. We propose a data fusion technique that allows analysts to easily incorporate auxiliary information on the dependence structure of variables not observed jointly; we refer to this auxiliary information as glue. With this technique, we fuse two marketing surveys from the book publisher HarperCollins using glue from the online, rapid-response polling company CivicScience. The fused data enable estimation of associations between people’s preferences for authors and for learning about new books. The analysis also serves as a case study on the potential for using online surveys to aid data fusion.

Published In

Annals of Applied Statistics

DOI

10.1214/16-AOAS925

EISSN

1941-7330

ISSN

1932-6157

Publication Date

December 1, 2016

Volume

10

Issue

4

Start / End Page

1907 / 1929

Related Subject Headings

  • Statistics & Probability
  • 4905 Statistics
  • 1403 Econometrics
  • 0104 Statistics
 

Citation

APA: Fosdick, B. K., Deyoreo, M., & Reiter, J. P. (2016). Categorical data fusion using auxiliary information. Annals of Applied Statistics, 10(4), 1907–1929. https://doi.org/10.1214/16-AOAS925
Chicago: Fosdick, B. K., M. Deyoreo, and J. P. Reiter. “Categorical data fusion using auxiliary information.” Annals of Applied Statistics 10, no. 4 (December 1, 2016): 1907–29. https://doi.org/10.1214/16-AOAS925.
ICMJE: Fosdick BK, Deyoreo M, Reiter JP. Categorical data fusion using auxiliary information. Annals of Applied Statistics. 2016 Dec 1;10(4):1907–29.
MLA: Fosdick, B. K., et al. “Categorical data fusion using auxiliary information.” Annals of Applied Statistics, vol. 10, no. 4, Dec. 2016, pp. 1907–29. Scopus, doi:10.1214/16-AOAS925.
NLM: Fosdick BK, Deyoreo M, Reiter JP. Categorical data fusion using auxiliary information. Annals of Applied Statistics. 2016 Dec 1;10(4):1907–1929.
