Skip to main content

Finding theme communities from database networks

Publication ,  Conference
Chu, L; Zhang, Y; Wang, Z; Yang, Y; Pei, J; Chen, E
Published in: Proceedings of the VLDB Endowment
January 1, 2019

Given a database network where each vertex is associated with a transaction database, we are interested in finding theme communities. Here, a theme community is a cohesive subgraph such that a common pattern is frequent in all transaction databases associated with the vertices in the subgraph. Finding all theme communities from a database network enjoys many novel applications. However, it is challenging since even counting the number of all theme communities in a database network is #P-hard. Inspired by the observation that a theme community shrinks when the length of the pattern increases, we investigate several properties of theme communities and develop TCFI, a scalable algorithm that uses these properties to effectively prune the patterns that cannot form any theme community. We also design TC-Tree, a scalable algorithm that decomposes and indexes theme communities efficiently. Retrieving a ranked list of theme communities from a TC-Tree of hundreds of millions of theme communities takes less than 1 second. Extensive experiments and a case study demonstrate the effectiveness and scalability of TCFI and TC-Tree in discovering and querying meaningful theme communities from large database networks.

Duke Scholars

Published In

Proceedings of the VLDB Endowment

DOI

EISSN

2150-8097

Publication Date

January 1, 2019

Volume

12

Issue

10

Start / End Page

1071 / 1084

Related Subject Headings

  • 4605 Data management and data science
  • 0807 Library and Information Studies
  • 0806 Information Systems
  • 0802 Computation Theory and Mathematics
 

Citation

APA
Chicago
ICMJE
MLA
NLM
Chu, L., Zhang, Y., Wang, Z., Yang, Y., Pei, J., & Chen, E. (2019). Finding theme communities from database networks. In Proceedings of the VLDB Endowment (Vol. 12, pp. 1071–1084). https://doi.org/10.14778/3339490.3339492

Published In

Proceedings of the VLDB Endowment

DOI

EISSN

2150-8097

Publication Date

January 1, 2019

Volume

12

Issue

10

Start / End Page

1071 / 1084

Related Subject Headings

  • 4605 Data management and data science
  • 0807 Library and Information Studies
  • 0806 Information Systems
  • 0802 Computation Theory and Mathematics