Top-k subgraph matching is a key operation in graph queries which is widely used in all kinds of applications such as social networks and knowledge graphs. Top-k subgraph matching is more challenging in distributed environments since it involves data and task transfers. Thus, existing local Top-k subgraph matching methods do not work well in distributed environments. An algorithm was developed to address this issue by dividing the problem into query decomposition, query execution, and ranked join stages. The query decomposition avoids unnecessary data transfers during the querying stage. The ranked join technique avoids generating unnecessary temporal results that reduces the overall latency of the algorithm. The algorithm effectiveness and efficiency were tested using real data and the results indicate that the algorithm, especially the optimized version, effectively solves the distributed Top-k subgraph matching problem.
LAN Chao
,
ZHANG Yong
,
XING Chunxiao
. Distributed Top-k subgraph matching[J]. Journal of Tsinghua University(Science and Technology), 2016
, 56(8)
: 871
-877
.
DOI: 10.16511/j.cnki.qhdxxb.2016.25.026
[1] Gupta M,Gao J,Yan X,et al.Top-k interesting subgraph discovery in information networks[C]//ICDE2014.Chicago,IL,USA:IEEE,2014:820-831.
[2] Huang J,Abadi D J,Ren K.Scalable SPARQL querying of large RDF graphs[J].Proc VLDB Endow,2011,4(11):1123-1134.
[3] Yang S,Wu Y,Sun H,et al.Schemaless and structureless graph querying[J].Proc VLDB Endow,2014,7(7):565-576.
[4] Khan A,Wu Y,Aggarwal C C,et al.NeMa:Fast graph search with label similarity[J].Proc VLDB Endow,2013,6(3):181-192.
[5] Gonzalez J E,Low Y,Gu H,et al.PowerGraph:Distributed graph-parallel computation on natural graphs[C]//USENIX 2012.Hollywood,CA,USA:USENIX Association,2012:17-30.
[6] Albert R,Jeong H,Barabási A.Error and attack tolerance of complex networks[J].Nature,2000,406(6794):378-382.
[7] Abou-Rjeili A,Karypis G.Multilevel algorithms for partitioning power-law graphs[C]//Proceedings of the 20th International Conference on Parallel and Distributed Processing.Rhodes Island,Greece:IEEE Computer Society,2006:124-124.
[8] Xin R S,Gonzalez J E,Franklin M J,et al.GraphX:A resilient distributed graph system on spark[C]//First International Workshop on Graph Data Management Experiences and Systems.New York,USA:ACM,2013:1-6.
[9] Ullmann J R.An algorithm for subgraph isomorphism[J].J ACM,1976,23(1):31-42.
[10] He H,Wang H,Yang J,et al.BLINKS:Ranked keyword searches on graphs[C]//SIGMOD2007.New York,USA:ACM,2007:305-316.
[11] Sun Z,Wang H,Wang H,et al.Efficient subgraph matching on billion node graphs[J].Proc VLDB Endow,2012,5(9):788-799.
[12] Gou G,Chirkova R.Efficient algorithms for exact ranked twig-pattern matching over graphs[C]//SIGMOD2008.New York,USA:ACM,2008:581-594.
[13] Ilyas I F,Beskales G,Soliman M A.A survey of top-k query processing techniques in relational database systems[J].ACM Comput Surv,2008,40(4):1-58.
[14] Dbpedia.Dbpedia[Z/OL].(2015-01-01).http://wiki.dbpedia.org/.