Scholars@Duke publication: A fast algorithm for subspace clustering by pattern similarity

A fast algorithm for subspace clustering by pattern similarity

Publication, Conference

Wang, H; Chu, F; Fan, W; Philip, SY; Pei, J

Published in: Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM

January 1, 2004

Unlike traditional clustering methods that focus on grouping objects with similar values on a set of dimensions, clustering by pattern similarity finds objects that exhibit a coherent pattern of rise and fall in subspaces. Pattern-based clustering extends the concept of traditional clustering and benefits a wide range of applications, including large-scale scientific data analysis, target marketing, and web usage analysis. However, state-of-the-art pattern-based clustering methods (e.g., the pCluster algorithm) can only handle datasets of thousands of records, which makes them inappropriate for many real-life applications. Furthermore, besides their huge volume, many datasets are also characterized by their sequentiality; for instance, customer purchase records and network event logs are usually modeled as data sequences. Hence, it becomes important to enable pattern-based clustering methods i) to handle large datasets, and ii) to discover pattern similarity embedded in data sequences. In this paper, we present a novel algorithm that offers this capability. Experimental results from both real-life and synthetic datasets demonstrate its effectiveness and efficiency.

Duke Scholars

Author Jian Pei Computer Science

Published In

Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM

DOI

10.1109/SSDM.2004.1311193

ISSN

1099-3371

Publication Date

January 1, 2004

Volume

16

Start / End Page

51 / 60

Citation

APA

Wang, H., Chu, F., Fan, W., Philip, S. Y., & Pei, J. (2004). A fast algorithm for subspace clustering by pattern similarity. In Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM (Vol. 16, pp. 51–60). https://doi.org/10.1109/SSDM.2004.1311193

Chicago

Wang, H., F. Chu, W. Fan, S. Y. Philip, and J. Pei. “A fast algorithm for subspace clustering by pattern similarity.” In Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM, 16:51–60, 2004. https://doi.org/10.1109/SSDM.2004.1311193.

ICMJE

Wang H, Chu F, Fan W, Philip SY, Pei J. A fast algorithm for subspace clustering by pattern similarity. In: Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM. 2004. p. 51–60.

MLA

Wang, H., et al. “A fast algorithm for subspace clustering by pattern similarity.” Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM, vol. 16, 2004, pp. 51–60. Scopus, doi:10.1109/SSDM.2004.1311193.

NLM

Wang H, Chu F, Fan W, Philip SY, Pei J. A fast algorithm for subspace clustering by pattern similarity. Proceedings of the International Conference on Scientific and Statistical Database Management, SSDBM. 2004. p. 51–60.
