Scholars@Duke publication: Implications of data topology for deep generative models

Implications of data topology for deep generative models

Publication, Journal Article

Jin, Y; McDaniel, R; Tatro, NJ; Catanzaro, MJ; Smith, AD; Bendich, P; Dwyer, MB; Fletcher, PT

Published in: Frontiers in Computer Science

January 1, 2024

Many deep generative models, such as variational autoencoders (VAEs) and generative adversarial networks (GANs), learn an immersion mapping from a standard normal distribution in a low-dimensional latent space into a higher-dimensional data space. As such, these mappings are only capable of producing simple data topologies, i.e., those equivalent to an immersion of Euclidean space. In this work, we demonstrate the limitations of such latent space generative models when trained on data distributions with non-trivial topologies. We do this by training these models on synthetic image datasets with known topologies (spheres, tori, etc.). We then show how this results in failures of both data generation and data interpolation. Next, we compare this behavior to two classes of deep generative models that in principle allow for more complex data topologies. First, we look at chart autoencoders (CAEs), which construct a smooth data manifold from multiple latent space chart mappings. Second, we explore score-based models, e.g., denoising diffusion probabilistic models, which estimate gradients of the data distribution without resorting to an explicit mapping to a latent space. Our results show that these models are better than latent space models at modeling data distributions with complex topologies; however, challenges remain.
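The interpolation failure described in the abstract can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a hand-built (not trained) decoder, here called `decode`, that immerses a one-dimensional latent space into the unit circle by way of the standard normal CDF. Because the real line is not topologically a circle, the image of this decoder misses one point of the circle, and two data points on either side of that gap are close in data space but far apart along any latent interpolation path. All function names and the chosen latent values are illustrative assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, mapping the real line onto the open interval (0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def decode(z):
    """Hypothetical toy decoder: 1-D latent value -> point on the unit circle.

    The angle sweeps (0, 2*pi) but never reaches either endpoint, so the
    point (1, 0) is never produced: the circle is cut open at that point.
    """
    theta = 2.0 * math.pi * normal_cdf(z)
    return (math.cos(theta), math.sin(theta))


def distance(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def latent_path_length(z_start, z_end, steps=200):
    """Length in data space of the decoded linear interpolation in latent space."""
    zs = [z_start + (z_end - z_start) * i / (steps - 1) for i in range(steps)]
    pts = [decode(z) for z in zs]
    return sum(distance(pts[i], pts[i + 1]) for i in range(len(pts) - 1))


# Two latent codes whose decoded points sit on opposite sides of the cut.
z_a, z_b = -2.0, 2.0
chord = distance(decode(z_a), decode(z_b))  # direct distance in data space
path = latent_path_length(z_a, z_b)         # distance travelled by interpolation

print(f"direct distance between endpoints: {chord:.3f}")
print(f"length of decoded latent path:     {path:.3f}")
```

In this toy setting the decoded interpolation travels almost the whole way around the circle, even though the two endpoints are neighbors in data space. A trained VAE or GAN decoder will differ in detail, but the same topological obstruction applies to any continuous map from a Euclidean latent space.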

Duke Scholars

Author: Paul L Bendich, Mathematics

Published In

Frontiers in Computer Science

DOI

10.3389/fcomp.2024.1260604

EISSN

2624-9898

Publication Date

January 1, 2024

Volume

6

Related Subject Headings

46 Information and computing sciences

Citation

APA

Jin, Y., McDaniel, R., Tatro, N. J., Catanzaro, M. J., Smith, A. D., Bendich, P., … Fletcher, P. T. (2024). Implications of data topology for deep generative models. Frontiers in Computer Science, 6. https://doi.org/10.3389/fcomp.2024.1260604

Chicago

Jin, Y., R. McDaniel, N. J. Tatro, M. J. Catanzaro, A. D. Smith, P. Bendich, M. B. Dwyer, and P. T. Fletcher. “Implications of data topology for deep generative models.” Frontiers in Computer Science 6 (January 1, 2024). https://doi.org/10.3389/fcomp.2024.1260604.

ICMJE

Jin Y, McDaniel R, Tatro NJ, Catanzaro MJ, Smith AD, Bendich P, et al. Implications of data topology for deep generative models. Frontiers in Computer Science. 2024 Jan 1;6.

MLA

Jin, Y., et al. “Implications of data topology for deep generative models.” Frontiers in Computer Science, vol. 6, Jan. 2024. Scopus, doi:10.3389/fcomp.2024.1260604.

NLM

Jin Y, McDaniel R, Tatro NJ, Catanzaro MJ, Smith AD, Bendich P, Dwyer MB, Fletcher PT. Implications of data topology for deep generative models. Frontiers in Computer Science. 2024 Jan 1;6.
