Community detection constitutes an important task for investigating the internal structure of networks, with a plethora of applications in a wide range of disciplines. A particularly important point, which is rarely taken into account while developing community detection algorithms, is their sensitivity (or stability) to network uncertainty. In many cases, the input graph data is incomplete or noisy (e.g., due to noise introduced during the collection of the data or for privacy preserving reasons). Then, the following question arises: how stable are the results produced by an algorithm with respect to the uncertainty (i.e., noise level) of the input data? In this project, we propose a quantitative way to address this problem.We have considered several graph perturbation models to introduce uncertainty to the graph. Then, we examine the sensitivity of an algorithm, with respect to functional and structural characteristics of the detected communities under various perturbation levels. We have studied the performance of some of the most widely used community detection algorithms in practice, and our experimental results indicate that random walk based community detection algorithms tend to be robust under various conditions of network uncertainty.
Name | #Nodes | #Edges | Type | Source | Download |
---|---|---|---|---|---|
Ca-Gr-Qc | 5.24K | 28.98K | Co-authorship graph | SNAP | Link |
Ca-Hep-Th | 9.87K | 51.97K | Co-authorship graph | SNAP | Link |
Wiki-Vote | 7.11K | 103.68K | Wikipedia who-votes-whom graph | SNAP | Link |
AS-Caida | 16.30K | 65.91K | Autonomous systems internet graph | SNAP | Link |
P2P-Gnutella | 6.30K | 20.77K | Gnutella peer to peer network from August 8, 2002 | SNAP | Link |