H-Mine: Fast and space-preserving frequent pattern mining in large databases

Publication, Journal Article

Pei, J; Han, J; Lu, H; Nishio, S; Tang, S; Yang, D

Published in: IIE Transactions Institute of Industrial Engineers

June 1, 2007

In this study, we propose a simple and novel data structure using hyper-links, H-struct, and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has a very limited and precisely predictable main memory cost and runs very quickly in memory-based settings. Moreover, it can be scaled up to very large databases using database partitioning. When the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has an excellent performance for various kinds of data, outperforms currently available algorithms in different settings, and is highly scalable to mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have a major impact on the future development of efficient and scalable data mining methods.
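The abstract describes mining by following links over an in-memory structure instead of building new projected databases. The sketch below is a minimal illustration of that general idea only, not the authors' H-struct or H-mine: it assumes each transaction is stored once as a sorted list of its frequent items, and represents every projected database as a queue of (transaction index, position) references into those lists. The function name `h_mine_sketch` and its parameters are hypothetical, and the dynamic link adjustment, database partitioning, and FP-tree fallback mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def h_mine_sketch(transactions, min_support):
    """Simplified pattern-growth miner using reference queues.

    Assumption: this is an illustrative stand-in, not the published
    H-mine algorithm. Returns {pattern_tuple: support_count}.
    """
    # Pass 1: count the support of every single item.
    counts = defaultdict(int)
    for t in transactions:
        for item in set(t):
            counts[item] += 1
    frequent = {i for i, c in counts.items() if c >= min_support}

    # Pass 2: keep only frequent items, in one fixed (sorted) order.
    # These rows are stored once and never copied during mining.
    rows = [sorted(i for i in set(t) if i in frequent) for t in transactions]
    rows = [r for r in rows if r]

    results = {}

    def mine(prefix, refs):
        # refs holds (row index, position) pairs; each position points
        # just past the last item of `prefix` in that row.
        queues = defaultdict(list)
        for tid, pos in refs:
            row = rows[tid]
            for j in range(pos, len(row)):
                # Link this row into the queue of every later item.
                queues[row[j]].append((tid, j + 1))
        for item in sorted(queues):
            queue = queues[item]
            if len(queue) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(queue)
                # Recurse on the references only; no rows are copied.
                mine(pattern, queue)

    mine((), [(tid, 0) for tid in range(len(rows))])
    return results


if __name__ == "__main__":
    db = [["a", "b", "c"], ["a", "c"], ["a", "d"], ["b", "c"]]
    for pattern, support in sorted(h_mine_sketch(db, 2).items()):
        print(pattern, support)
```

Because only references are passed down the recursion, the extra memory at each level is bounded by the number of item occurrences in the stored rows, which is one way to read the abstract's claim of a predictable main memory cost.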

Duke Scholars

Author: Jian Pei, Computer Science

Published In

IIE Transactions Institute of Industrial Engineers

DOI

10.1080/07408170600897460

EISSN

1545-8830

ISSN

0740-817X

Publication Date

June 1, 2007

Volume

39

Issue

6

Start / End Page

593 / 605

Related Subject Headings

Operations Research
49 Mathematical sciences
40 Engineering
35 Commerce, management, tourism and services
15 Commerce, Management, Tourism and Services
09 Engineering
01 Mathematical Sciences

Citation

Pei, J., Han, J., Lu, H., Nishio, S., Tang, S., & Yang, D. (2007). H-Mine: Fast and space-preserving frequent pattern mining in large databases. IIE Transactions Institute of Industrial Engineers, 39(6), 593–605. https://doi.org/10.1080/07408170600897460

Pei, J., J. Han, H. Lu, S. Nishio, S. Tang, and D. Yang. “H-Mine: Fast and space-preserving frequent pattern mining in large databases.” IIE Transactions Institute of Industrial Engineers 39, no. 6 (June 1, 2007): 593–605. https://doi.org/10.1080/07408170600897460.

Pei J, Han J, Lu H, Nishio S, Tang S, Yang D. H-Mine: Fast and space-preserving frequent pattern mining in large databases. IIE Transactions Institute of Industrial Engineers. 2007 Jun 1;39(6):593–605.

Pei, J., et al. “H-Mine: Fast and space-preserving frequent pattern mining in large databases.” IIE Transactions Institute of Industrial Engineers, vol. 39, no. 6, June 2007, pp. 593–605. Scopus, doi:10.1080/07408170600897460.
