Parallel incremental dynamic community detection algorithm based on Spark

WU Bin, XIAO Yan, ZHANG Yunlei

Journal of Tsinghua University(Science and Technology) ›› 2017, Vol. 57 ›› Issue (10) : 1030-1037.

PDF(1131 KB)
PDF(1131 KB)
Journal of Tsinghua University(Science and Technology) ›› 2017, Vol. 57 ›› Issue (10) : 1030-1037. DOI: 10.16511/j.cnki.qhdxxb.2017.25.041
COMPUTER SCIENCE AND TECHNOLOGY

Parallel incremental dynamic community detection algorithm based on Spark

  • {{article.zuoZhe_EN}}
Author information +
History +

Abstract

Dynamic community detection is a crucial step in researching network evolution. Nonparallel algorithms cannot efficiently mine community structures with large amounts of data. This paper presents a parallel incremental dynamic community detection algorithm based on Spark (PIDCDS), which maximizes the total permanence of the vertices in the network to discover the community structure. PIDCDS uses parallel computations on GraphX to calculate a specialized permanence as the community partition metric. Only the permanence of the incremental vertices is computed in each step to derive a new community structure. This algorithm is accurate with less computations. PIDCDS has better stability and more accurately detects the community structure than the FacetNet algorithm. PIDCDS performs well and the execution time increases only slowly as the network scale increases. More cores will accelerate PIDCDS to some extent.

Key words

parallel dynamic community detection / permanence / Spark / increment computation

Cite this article

Download Citations
WU Bin, XIAO Yan, ZHANG Yunlei. Parallel incremental dynamic community detection algorithm based on Spark[J]. Journal of Tsinghua University(Science and Technology). 2017, 57(10): 1030-1037 https://doi.org/10.16511/j.cnki.qhdxxb.2017.25.041

References

[1] 李国杰. 网络科学:21世纪的元科学[J]. 中国计算机学会通讯, 2016,12(4):7. LI Guojie.Network science:The meta science in the 21st century[J]. Communications of the China Computer Federation, 2016,12(4):7. (in Chinese)[2] Fortunato S. Community detection in graphs[J].Physics Reports, 2010,486(3):75-174.[3] Girvan M, Newman M E J. Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences, 2002,99(12):7821-7826.[4] Newman M E J. Modularity and community structure in networks[J].Proceedings of the National Academy of Sciences, 2006,103(23):8577-8582.[5] 汪小帆, 李翔, 陈关荣. 网络科学导论[M]. 北京:高等教育出版社, 2012. WANG Xiaofan, LI Xiang, CHEN Guanrong. Network Science:An Introduction[M]. Beijing:Higher Education Press, 2012. (in Chinese)[6] Chakraborty T, Srinivasan S, Ganguly N, et al. On the permanence of vertices in network communities[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA:ACM, 2014:1396-1405.[7] Ning H, Xu W, Chi Y, et al. Incremental spectral clustering by efficiently updating the eigen-system[J].Pattern Recognition,2010,43(1):113-127.[8] Zaharia M, Chowdhury M, Das T, et al. Resilient distributed datasets:A fault-tolerant abstraction for in-memory cluster computing[C]//Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation. Berkeley, CA,USA:USENIX Association, 2012:141-146.[9] Apache. Hadoop[EB/OL].[2016-07-29]. http://hadoop.apache.org.[10] Xin R S, Gonzalez J E, Franklin M J, et al. GraphX:A resilient distributed graph system on spark[C]//Proceedingsof the 1st Int Workshop on Graph Data Management Experiences and Systems. New York, NY, USA:ACM, 2013.[11] Lu H, Halappanavar M, Kalyanaraman A. Parallel heuristics for scalable community detection[J].Parallel Computing, 2015,47:19-37.[12] Staudt C L, Meyerhenke H. Engineering parallel algorithms for community detection in massive networks[J].IEEE Transactions on Parallel and Distributed Systems, 2016,27(1):171-184.[13] Danon L, Diaz-Guilera A, Duch J, et al. Comparing communitystructure identification[J].Journal of Statistical Mechanics:Theory and Experiment, 2005(9), P09008.[14] Lancichinetti A, Fortunato S, Radicchi F. Benchmark graphs for testing community detection algorithms[J].Physical Review E, 2008,78(4), 046110.[15] LI Xiaoming, WU Bin, GUO Qian, et al. Dynamic community detection algorithm based on incremental identification[C]//2015 IEEE International Conference on Data Mining Workshop (ICDMW). Atlantic City, NJ, USA:IEEE, 2015.[16] Lin Y R, Chi Y, Zhu S, et al. Facetnet:A framework for analyzing communities and their evolutions in dynamic networks[C]//Proceedings of the 17th international conference on World Wide Web. New York, NY, USA:ACM, 2008:685-694.
PDF(1131 KB)

Accesses

Citation

Detail

Sections
Recommended

/