PERSISTENT OBSTRUCTION THEORY FOR A MODEL CATEGORY OF MEASURES WITH APPLICATIONS TO DATA MERGING

Publication, Journal Article
Smith, AD; Bendich, P; Harer, J
Published in: Transactions of the American Mathematical Society Series B
February 2, 2021

Collections of measures on compact metric spaces form a model category (“data complexes”), whose morphisms are marginalization integrals. The fibrant objects in this category represent collections of measures in which there is a measure on a product space that marginalizes to any measures on pairs of its factors. The homotopy and homology for this category allow measurement of obstructions to finding measures on larger and larger product spaces. The obstruction theory is compatible with a fibrant filtration built from the Wasserstein distance on measures. Despite the abstract tools, this is motivated by a widespread problem in data science. Data complexes provide a mathematical foundation for semi-automated data-alignment tools that are common in commercial database software. Practically speaking, the theory shows that database JOIN operations are subject to genuine topological obstructions. Those obstructions can be detected by an obstruction cocycle and can be resolved by moving through a filtration. Thus, any collection of databases has a persistence level, which measures the difficulty of JOINing those databases. Because of its general formulation, this persistent obstruction theory also encompasses multi-modal data fusion problems, some forms of Bayesian inference, and probability couplings.
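The obstruction described in the abstract can be illustrated on a toy case. The sketch below is not the construction from the paper: it takes three hypothetical pairwise measures on binary variables (the names `mu_xy`, `mu_yz`, `mu_xz` and both helper functions are assumptions made for illustration), checks that their one-variable marginals agree, and then tests a simple necessary condition for a joint measure on the triple product. Any joint measure can only place mass on a triple whose three pairwise projections all carry positive mass, so an empty set of such triples rules out a joint measure.

```python
from itertools import product


def marginal(pair_measure, axis):
    """Marginalize a measure on a product of two finite sets onto one factor."""
    out = {}
    for key, mass in pair_measure.items():
        out[key[axis]] = out.get(key[axis], 0.0) + mass
    return out


def compatible_support(mu_xy, mu_yz, mu_xz, values=(0, 1)):
    """Triples (x, y, z) on which a joint measure could place mass.

    If a joint measure gives (x, y, z) positive mass, each of its pairwise
    marginals gives the projected pair positive mass. An empty result
    therefore means no joint measure has the three given marginals.
    """
    return [
        (x, y, z)
        for x, y, z in product(values, repeat=3)
        if mu_xy.get((x, y), 0) > 0
        and mu_yz.get((y, z), 0) > 0
        and mu_xz.get((x, z), 0) > 0
    ]


# Hypothetical data: X = Y and Y = Z always, yet X != Z always.
mu_xy = {(0, 0): 0.5, (1, 1): 0.5}
mu_yz = {(0, 0): 0.5, (1, 1): 0.5}
mu_xz = {(0, 1): 0.5, (1, 0): 0.5}

# The single-variable marginals agree, so the pairwise data look consistent.
pairwise_consistent = (
    marginal(mu_xy, 0) == marginal(mu_xz, 0)
    and marginal(mu_xy, 1) == marginal(mu_yz, 0)
    and marginal(mu_yz, 1) == marginal(mu_xz, 1)
)

# No triple is compatible with all three pairwise supports.
obstructed = compatible_support(mu_xy, mu_yz, mu_xz) == []

# Replacing the third measure with one where X = Z removes the obstruction.
mu_xz_ok = {(0, 0): 0.5, (1, 1): 0.5}
resolved_support = compatible_support(mu_xy, mu_yz, mu_xz_ok)
```

In database terms, this is a three-table situation where every pair of tables can be matched on its shared columns but no single row set is consistent with all three matches. The support check is only a necessary condition; deciding existence of a joint measure in general is a linear feasibility problem, and the filtration by Wasserstein distance mentioned in the abstract is not modeled here.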

Published In: Transactions of the American Mathematical Society Series B
DOI: 10.1090/btran/56
EISSN: 2330-0000
Publication Date: February 2, 2021
Volume: 8
Issue: 1
Start / End Page: 1 / 38

Citation

APA: Smith, A. D., Bendich, P., & Harer, J. (2021). PERSISTENT OBSTRUCTION THEORY FOR A MODEL CATEGORY OF MEASURES WITH APPLICATIONS TO DATA MERGING. Transactions of the American Mathematical Society Series B, 8(1), 1–38. https://doi.org/10.1090/btran/56

Chicago: Smith, A. D., P. Bendich, and J. Harer. “PERSISTENT OBSTRUCTION THEORY FOR A MODEL CATEGORY OF MEASURES WITH APPLICATIONS TO DATA MERGING.” Transactions of the American Mathematical Society Series B 8, no. 1 (February 2, 2021): 1–38. https://doi.org/10.1090/btran/56.

ICMJE: Smith AD, Bendich P, Harer J. PERSISTENT OBSTRUCTION THEORY FOR A MODEL CATEGORY OF MEASURES WITH APPLICATIONS TO DATA MERGING. Transactions of the American Mathematical Society Series B. 2021 Feb 2;8(1):1–38.

MLA: Smith, A. D., et al. “PERSISTENT OBSTRUCTION THEORY FOR A MODEL CATEGORY OF MEASURES WITH APPLICATIONS TO DATA MERGING.” Transactions of the American Mathematical Society Series B, vol. 8, no. 1, Feb. 2021, pp. 1–38. Scopus, doi:10.1090/btran/56.

NLM: Smith AD, Bendich P, Harer J. PERSISTENT OBSTRUCTION THEORY FOR A MODEL CATEGORY OF MEASURES WITH APPLICATIONS TO DATA MERGING. Transactions of the American Mathematical Society Series B. 2021 Feb 2;8(1):1–38.
