Skip to main content
Journal cover image

Mining frequent patterns without candidate generation: A frequent-pattern tree approach

Publication ,  Journal Article
Han, J; Pei, J; Yin, Y; Mao, R
Published in: Data Mining and Knowledge Discovery
January 1, 2004

Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist a large number of patterns and/or long patterns. In this study, we propose a novel frequent-pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a condensed, smaller data structure, FP-tree which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern-fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent-pattern mining methods.

Duke Scholars

Altmetric Attention Stats
Dimensions Citation Stats

Published In

Data Mining and Knowledge Discovery

DOI

ISSN

1384-5810

Publication Date

January 1, 2004

Volume

8

Issue

1

Start / End Page

53 / 87

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 46 Information and computing sciences
  • 0806 Information Systems
  • 0804 Data Format
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Han, J., Pei, J., Yin, Y., & Mao, R. (2004). Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery, 8(1), 53–87. https://doi.org/10.1023/B:DAMI.0000005258.31418.83
Han, J., J. Pei, Y. Yin, and R. Mao. “Mining frequent patterns without candidate generation: A frequent-pattern tree approach.” Data Mining and Knowledge Discovery 8, no. 1 (January 1, 2004): 53–87. https://doi.org/10.1023/B:DAMI.0000005258.31418.83.
Han J, Pei J, Yin Y, Mao R. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery. 2004 Jan 1;8(1):53–87.
Han, J., et al. “Mining frequent patterns without candidate generation: A frequent-pattern tree approach.” Data Mining and Knowledge Discovery, vol. 8, no. 1, Jan. 2004, pp. 53–87. Scopus, doi:10.1023/B:DAMI.0000005258.31418.83.
Han J, Pei J, Yin Y, Mao R. Mining frequent patterns without candidate generation: A frequent-pattern tree approach. Data Mining and Knowledge Discovery. 2004 Jan 1;8(1):53–87.
Journal cover image

Published In

Data Mining and Knowledge Discovery

DOI

ISSN

1384-5810

Publication Date

January 1, 2004

Volume

8

Issue

1

Start / End Page

53 / 87

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 46 Information and computing sciences
  • 0806 Information Systems
  • 0804 Data Format
  • 0801 Artificial Intelligence and Image Processing