Skip to main content

Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering

Publication ,  Journal Article
Pitsianis, N; Floros, D; Iliopoulos, A-S; Mylonakis, K; Sismanis, N; Sun, X
September 11, 2017

Calculation of near-neighbor interactions among high dimensional, irregularly distributed data points is a fundamental task to many graph-based or kernel-based machine learning algorithms and applications. Such calculations, involving large, sparse interaction matrices, expose the limitation of conventional data-and-computation reordering techniques for improving space and time locality on modern computer memory hierarchies. We introduce a novel method for obtaining a matrix permutation that renders a desirable sparsity profile. The method is distinguished by the guiding principle to obtain a profile that is block-sparse with dense blocks. Our profile model and measure capture the essential properties affecting space and time locality, and permit variation in sparsity profile without imposing a restriction to a fixed pattern. The second distinction lies in an efficient algorithm for obtaining a desirable profile, via exploring and exploiting multi-scale cluster structure hidden in but intrinsic to the data. The algorithm accomplishes its task with key components for lower-dimensional embedding with data-specific principal feature axes, hierarchical data clustering, multi-level matrix compression storage, and multi-level interaction computations. We provide experimental results from case studies with two important data analysis algorithms. The resulting performance is remarkably comparable to the BLAS performance for the best-case interaction governed by a regularly banded matrix with the same sparsity.

Duke Scholars

Publication Date

September 11, 2017
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Pitsianis, N., Floros, D., Iliopoulos, A.-S., Mylonakis, K., Sismanis, N., & Sun, X. (2017). Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering.
Pitsianis, Nikos, Dimitris Floros, Alexandros-Stavros Iliopoulos, Kostas Mylonakis, Nikos Sismanis, and Xiaobai Sun. “Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering,” September 11, 2017.
Pitsianis N, Floros D, Iliopoulos A-S, Mylonakis K, Sismanis N, Sun X. Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering. 2017 Sep 11;
Pitsianis N, Floros D, Iliopoulos A-S, Mylonakis K, Sismanis N, Sun X. Rapid Near-Neighbor Interaction of High-dimensional Data via Hierarchical Clustering. 2017 Sep 11;

Publication Date

September 11, 2017