Scaffoldings and Spines: Organizing High-Dimensional Data Using Cover Trees, Local Principal Component Analysis, and Persistent Homology

Published

Journal Article (Chapter)

© 2018, The Author(s) and the Association for Women in Mathematics.

We propose a flexible, multi-scale method for organizing, visualizing, and understanding point cloud datasets sampled from or near stratified spaces. The first part of the algorithm produces a cover tree for a dataset using an adaptive threshold based on multi-scale local principal component analysis. The resulting cover tree nodes reflect the local geometry of the space and are organized via a scaffolding graph. The second part of the algorithm uncovers the strata that make up the underlying stratified space, using a local dimension estimation procedure and topological data analysis, and ultimately visualizes the results in a simplified spine graph. We demonstrate our technique on several synthetic examples and then use it to visualize song structure in musical audio data.
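As a rough illustration of what a local dimension estimate based on principal component analysis can look like, the sketch below counts how many principal directions are needed to explain most of the variance among the points near a chosen center. It is a minimal sketch under stated assumptions, not the procedure from the chapter: the function name `local_pca_dimension`, the fixed `radius`, and the `variance_kept` threshold of 0.95 are all hypothetical choices made for this example.

```python
import numpy as np

def local_pca_dimension(points, center, radius, variance_kept=0.95):
    """Estimate the local dimension near `center` from a local PCA spectrum.

    Hypothetical helper: the 0.95 variance threshold is an assumption,
    not a value taken from the chapter.
    """
    # Collect the points within `radius` of `center` (the scale parameter).
    dists = np.linalg.norm(points - center, axis=1)
    nbrs = points[dists <= radius]
    if len(nbrs) < 2:
        return 0  # too few points to detect any direction of variation

    # Eigenvalues of the local covariance matrix, largest first.
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / (len(nbrs) - 1)
    eigvals = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)
    total = eigvals.sum()
    if total == 0:
        return 0  # all neighbors coincide

    # Smallest number of principal directions explaining `variance_kept`
    # of the total local variance.
    explained = np.cumsum(eigvals) / total
    return int(np.searchsorted(explained, variance_kept) + 1)
```

Calling the function at several values of `radius` gives a crude multi-scale picture: points on a curve embedded in three dimensions should report dimension 1 at small scales, and points on a flat sheet should report 2.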

Cited Authors

  • Bendich, P; Gasparovic, E; Harer, J; Tralie, CJ

Published Date

  • January 1, 2018

Volume / Issue

  • 13 /

Start / End Page

  • 93 - 114

Electronic International Standard Serial Number (EISSN)

  • 2364-5741

International Standard Serial Number (ISSN)

  • 2364-5733

Digital Object Identifier (DOI)

  • 10.1007/978-3-319-89593-2_6

Citation Source

  • Scopus