Skip to main content

Mining frequent cross-graph quasi-cliques

Publication ,  Journal Article
Jiang, D; Pei, J
Published in: ACM Transactions on Knowledge Discovery from Data
January 1, 2009

Joint mining of multiple datasets can often discover interesting, novel, and reliable patterns which cannot be obtained solely from any single source. For example, in bioinformatics, jointly mining multiple gene expression datasets obtained by different labs or during various biological processes may overcome the heavy noise in the data. Moreover, by joint mining of gene expression data and protein-protein interaction data, we may discover clusters of genes which show coherent expression patterns and also produce interacting proteins. Such clusters may be potential pathways. In this article, we investigate a novel data mining problem, mining frequent cross-graph quasi-cliques, which is generalized from several interesting applications in bioinformatics, cross-market customer segmentation, social network analysis, and Web mining. In a graph, a set of vertices S is a -quasi-clique (0 < 1) if each vertex v in S directly connects to at least (S - 1) other vertices in S. Given a set of graphs G1, , Gn and parameter min-sup (0 < min-sup 1), a set of vertices S is a frequent cross-graph quasi-clique if S is a -quasi-clique in at least min-sup n graphs, and there does not exist a proper superset of S having the property. We build a general model, show why the complete set of frequent cross-graph quasi-cliques cannot be found by previous data mining methods, and study the complexity of the problem. While the problem is difficult, we develop practical algorithms which exploit several interesting and effective techniques and heuristics to efficaciously mine frequent cross-graph quasi-cliques. A systematic performance study is reported on both synthetic and real data sets. We demonstrate some interesting and meaningful frequent cross-graph quasi-cliques in bioinformatics. The experimental results also show that our algorithms are efficient and scalable. © 2009 ACM.

Duke Scholars

Published In

ACM Transactions on Knowledge Discovery from Data

DOI

EISSN

1556-472X

ISSN

1556-4681

Publication Date

January 1, 2009

Volume

2

Issue

4

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4606 Distributed computing and systems software
  • 4605 Data management and data science
  • 4604 Cybersecurity and privacy
  • 0806 Information Systems
  • 0801 Artificial Intelligence and Image Processing
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Jiang, D., & Pei, J. (2009). Mining frequent cross-graph quasi-cliques. ACM Transactions on Knowledge Discovery from Data, 2(4). https://doi.org/10.1145/1460797.1460799
Jiang, D., and J. Pei. “Mining frequent cross-graph quasi-cliques.” ACM Transactions on Knowledge Discovery from Data 2, no. 4 (January 1, 2009). https://doi.org/10.1145/1460797.1460799.
Jiang D, Pei J. Mining frequent cross-graph quasi-cliques. ACM Transactions on Knowledge Discovery from Data. 2009 Jan 1;2(4).
Jiang, D., and J. Pei. “Mining frequent cross-graph quasi-cliques.” ACM Transactions on Knowledge Discovery from Data, vol. 2, no. 4, Jan. 2009. Scopus, doi:10.1145/1460797.1460799.
Jiang D, Pei J. Mining frequent cross-graph quasi-cliques. ACM Transactions on Knowledge Discovery from Data. 2009 Jan 1;2(4).

Published In

ACM Transactions on Knowledge Discovery from Data

DOI

EISSN

1556-472X

ISSN

1556-4681

Publication Date

January 1, 2009

Volume

2

Issue

4

Related Subject Headings

  • Artificial Intelligence & Image Processing
  • 4606 Distributed computing and systems software
  • 4605 Data management and data science
  • 4604 Cybersecurity and privacy
  • 0806 Information Systems
  • 0801 Artificial Intelligence and Image Processing