Scholars@Duke publication: MaPle: A fast algorithm for maximal pattern-based clustering

MaPle: A fast algorithm for maximal pattern-based clustering

Publication, Conference

Pei, J; Zhang, X; Cho, M; Wang, H; Yu, PS

Published in: Proceedings IEEE International Conference on Data Mining, ICDM

December 1, 2003

Pattern-based clustering is important in many applications, such as DNA micro-array data analysis, automatic recommendation systems and target marketing systems. However, pattern-based clustering in large databases is challenging. On the one hand, there can be a huge number of clusters, many of which can be redundant and thus make pattern-based clustering ineffective. On the other hand, the previously proposed methods may not be efficient or scalable in mining large databases. In this paper, we study the problem of maximal pattern-based clustering. Redundant clusters are avoided completely by mining only the maximal pattern-based clusters. MaPle, an efficient and scalable mining algorithm, is developed. It conducts a depth-first, divide-and-conquer search and prunes unnecessary branches smartly. Our extensive performance study on both synthetic and real data sets shows that maximal pattern-based clustering is effective: it reduces the number of clusters substantially. Moreover, MaPle is more efficient and scalable than the previously proposed pattern-based clustering methods in mining large databases. © 2003 IEEE.
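The abstract does not spell out what makes a cluster "pattern-based" or "maximal". As a rough illustration only, the sketch below assumes the delta-pCluster definition commonly used in the pattern-based clustering literature: a set of objects (rows) and attributes (columns) forms a cluster if every 2x2 submatrix has a pScore of at most delta, and a cluster is maximal if no other cluster contains it. This is a brute-force check and filter, not the MaPle search itself; the function names and the toy data are hypothetical, and minimum-size thresholds on rows and columns are omitted.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """pScore of the 2x2 submatrix formed by rows x, y and columns a, b."""
    return abs((x[a] - y[a]) - (x[b] - y[b]))


def is_p_cluster(data, rows, cols, delta):
    """True if every 2x2 submatrix of the selected rows/columns has pScore <= delta."""
    return all(
        p_score(data[r1], data[r2], c1, c2) <= delta
        for r1, r2 in combinations(rows, 2)
        for c1, c2 in combinations(cols, 2)
    )


def keep_maximal(clusters):
    """Drop every cluster that is contained in another cluster.

    Each cluster is a (frozenset of rows, frozenset of columns) pair.
    """
    unique = list(dict.fromkeys(clusters))  # de-duplicate, keep input order
    return [
        (r, c)
        for (r, c) in unique
        if not any(
            (r, c) != (r2, c2) and r <= r2 and c <= c2 for (r2, c2) in unique
        )
    ]


# Toy data (assumed for illustration): row 1 is row 0 shifted by +1,
# so rows 0 and 1 follow the same pattern across all three columns.
data = [
    [1, 5, 3],
    [2, 6, 4],
    [9, 2, 7],
]

candidates = [
    (frozenset({0, 1}), frozenset({0, 1})),
    (frozenset({0, 1}), frozenset({0, 1, 2})),
    (frozenset({0, 1}), frozenset({1, 2})),
]
maximal = keep_maximal(candidates)
```

Here the two smaller candidates are sub-clusters of the cluster over rows {0, 1} and all three columns, so only that one survives the filter. Reporting only such maximal clusters is what removes the redundancy the abstract describes; how MaPle finds them efficiently (depth-first search with pruning) is not reproduced here.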

Duke Scholars

Author: Jian Pei, Computer Science

Published In

Proceedings IEEE International Conference on Data Mining, ICDM

ISSN

1550-4786

Publication Date

December 1, 2003

Start / End Page

259 / 266

Citation

APA

Pei, J., Zhang, X., Cho, M., Wang, H., & Yu, P. S. (2003). MaPle: A fast algorithm for maximal pattern-based clustering. In Proceedings IEEE International Conference on Data Mining, ICDM (pp. 259–266).

Chicago

Pei, J., X. Zhang, M. Cho, H. Wang, and P. S. Yu. “MaPle: A fast algorithm for maximal pattern-based clustering.” In Proceedings IEEE International Conference on Data Mining, ICDM, 259–66, 2003.

ICMJE

Pei J, Zhang X, Cho M, Wang H, Yu PS. MaPle: A fast algorithm for maximal pattern-based clustering. In: Proceedings IEEE International Conference on Data Mining, ICDM. 2003. p. 259–66.

MLA

Pei, J., et al. “MaPle: A fast algorithm for maximal pattern-based clustering.” Proceedings IEEE International Conference on Data Mining, ICDM, 2003, pp. 259–66.

NLM

Pei J, Zhang X, Cho M, Wang H, Yu PS. MaPle: A fast algorithm for maximal pattern-based clustering. Proceedings IEEE International Conference on Data Mining, ICDM. 2003. p. 259–266.
