Scalable mining of large disk-based graph databases

Publication, Conference
Wang, C; Wang, W; Pei, J; Zhu, Y; Shi, B
Published in: KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
January 1, 2004

Mining frequent structural patterns from graph databases is an interesting problem with broad applications. Most previous studies focus on effectively pruning unfruitful search subspaces, but few address mining on large, disk-based databases. Because many graph databases in applications cannot be held in main memory, scalable mining of large, disk-based graph databases remains a challenging problem. In this paper, we develop an effective index structure, ADI (for adjacency index), to support mining various graph patterns over large databases that cannot be held in main memory. The index is simple and efficient to build. Moreover, the new index structure can be easily adopted in various existing graph pattern mining algorithms. As an example, we adapt the well-known gSpan algorithm to use the ADI structure. The experimental results show that the new index structure enables scalable graph pattern mining over large databases. In one set of experiments, the new disk-based method can mine graph databases with one million graphs, while the original gSpan algorithm can only handle databases of up to 300 thousand graphs. Moreover, our new method is faster than gSpan when both can run in main memory.
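The abstract does not describe how ADI is laid out, so the following is only an illustrative sketch of the general idea behind an edge-keyed index for frequent pattern mining, not the paper's structure. It assumes undirected graphs with labeled vertices and edges; the names `edge_key`, `build_edge_index`, and `frequent_edges` are hypothetical. The index maps each distinct labeled edge to the ids of the graphs that contain it, so a miner can count support for single-edge patterns without rescanning every graph.

```python
from collections import defaultdict


def edge_key(label_u, label_e, label_v):
    """Canonical key for an undirected labeled edge (endpoint labels sorted)."""
    lo, hi = sorted((label_u, label_v))
    return (lo, label_e, hi)


def build_edge_index(graphs):
    """Map each distinct labeled edge to the set of graph ids containing it.

    graphs: dict of graph_id -> (vertex_labels, edges), where vertex_labels
    is a dict vertex -> label and edges is a list of (u, v, edge_label).
    This in-memory dict is an assumption for illustration; a disk-based
    index would store these lists in blocks on disk.
    """
    index = defaultdict(set)
    for gid, (vlabels, edges) in graphs.items():
        for u, v, elabel in edges:
            index[edge_key(vlabels[u], elabel, vlabels[v])].add(gid)
    return index


def frequent_edges(index, min_support):
    """Keep edges that occur in at least min_support distinct graphs."""
    return {e: gids for e, gids in index.items() if len(gids) >= min_support}


# Toy database of three small labeled graphs (made-up data).
graphs = {
    0: ({0: "C", 1: "O", 2: "C"}, [(0, 1, "s"), (0, 2, "d")]),
    1: ({0: "O", 1: "C"}, [(0, 1, "s")]),
    2: ({0: "C", 1: "N"}, [(0, 1, "s")]),
}
index = build_edge_index(graphs)
freq = frequent_edges(index, min_support=2)
```

In this toy database only the C-O edge with label "s" appears in two graphs, so it is the only entry left in `freq`. A pattern-growth miner such as gSpan would start from such frequent edges and extend them, consulting the graph-id lists to limit which graphs need to be read.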

Published In

KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

DOI

10.1145/1014052.1014088

Publication Date

January 1, 2004

Start / End Page

316 / 325
 

Citation

APA: Wang, C., Wang, W., Pei, J., Zhu, Y., & Shi, B. (2004). Scalable mining of large disk-based graph databases. In KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 316–325). https://doi.org/10.1145/1014052.1014088
Chicago: Wang, C., W. Wang, J. Pei, Y. Zhu, and B. Shi. “Scalable mining of large disk-based graph databases.” In KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 316–25, 2004. https://doi.org/10.1145/1014052.1014088.
ICMJE: Wang C, Wang W, Pei J, Zhu Y, Shi B. Scalable mining of large disk-based graph databases. In: KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2004. p. 316–25.
MLA: Wang, C., et al. “Scalable mining of large disk-based graph databases.” KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2004, pp. 316–25. Scopus, doi:10.1145/1014052.1014088.
NLM: Wang C, Wang W, Pei J, Zhu Y, Shi B. Scalable mining of large disk-based graph databases. KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2004. p. 316–325.
