LTH-image

Community Detection: Fundamental Limits and Optimal Sampling Algorithms

Alexandre Proutiere, Kungliga Tekniska Högskolan, Sweden

Abstract:

We consider networks consisting of a finite number of non-overlapping communities. To extract these communities, the interaction between pairs of nodes may be sampled from a large available data set, which allows a given node pair to be sampled several times. When a node pair is sampled, the observed outcome is a binary random variable, equal to 1 if nodes interact and to 0 otherwise. The outcome is more likely to be positive if nodes belong to the same communities. For a given budget of node pair samples or observations, we wish to jointly design a sampling strategy (the sequence of sampled node pairs) and a clustering algorithm that recover the hidden communities with the highest possible accuracy. We consider both  non-adaptive and adaptive sampling strategies, and for both classes of strategies, we derive fundamental performance limits satisfied by any sampling and clustering algorithm. In particular, we provide necessary conditions for the existence of algorithms recovering the communities accurately as the network size grows large. We also devise simple algorithms that accurately reconstruct the communities when this is at all possible, hence proving that the proposed necessary conditions for accurate community detection are also sufficient. The classical problem of community detection in the stochastic block model can be seen as a particular instance of the problems consider here. But our framework covers more general scenarios where the sequence of sampled node pairs can be designed in an adaptive manner. We provide new results for the stochastic block model, and extends the analysis to the case of adaptive sampling. This is a joint work with Se-Young Yun.

 

Presentation slides

Biography:Researcher at Microsoft Research (Cambridge) from 2007 to 2011, Research Engineer at France Telecom R&D from 2000 to 2006, Invited lecturer and researcher at the computer science department ENS Paris from 2004 to 2006. Education: PhD in Applied Mathematics from Ecole Polytechnique, Graduated in Mathematiques from Ecole Normale Superieure, Engineering degree from Telecom Paris, Engineer from Corps des Mines. Awards: ACM Sigmetrics rising star award in 2009, ACM best papers awards at Sigmetrics 2004 and 2010, and Mobihoc 2009.