Digraph Clustering by the BlueRed Method
We introduce a new method for vertex clustering, or community detection, on directed graphs (digraphs). The method extends BlueRed, which was originally introduced for undirected graphs. Complementary to supervised or semi-supervised classification, unsupervised graph clustering is indispensable to exploratory data analysis and knowledge discovery. Conventional graph clustering methods are fundamentally hindered in effectiveness and efficiency by either the resolution limit or various problems with resolution parameter selection. BlueRed is original in its analysis, modeling, and solution approach. Its clustering process is simple, fully autonomous, and unsupervised. Among other potential impacts, BlueRed breaks new ground for high-throughput, low-cost, and high-performance graph clustering computation, as it removes the barrier of parameter tuning and selection. We report benchmark studies on real-world graph data to evaluate the new method. The clustering results are in remarkable agreement with the ground-truth labels. We also present a study on the U.S. patent citation graph CITE75_99, in which more than a quarter of the 3.7 million patents have no electronic record of category codes. With BlueRed, we are able to efficiently and economically give a semantic presentation of the patents without category codes.