Please wait a minute...
 首页  期刊介绍 期刊订阅 联系我们 横山亮次奖 百年刊庆
 
最新录用  |  预出版  |  当期目录  |  过刊浏览  |  阅读排行  |  下载排行  |  引用排行  |  横山亮次奖  |  百年刊庆
清华大学学报(自然科学版)  2018, Vol. 58 Issue (12): 1066-1071    DOI: 10.16511/j.cnki.qhdxxb.2018.26.050
  计算机科学与技术 本期目录 | 过刊浏览 | 高级检索 |
面向大规模时序图SimRank的计算方法
苗壮, 袁野, 乔百友, 王一舒, 马玉亮, 王国仁
东北大学 计算机科学与工程学院, 沈阳 110819
SimRank calculations for large temporal graphs
MIAO Zhuang, YUAN Ye, QIAO Baiyou, WANG Yishu, MA Yuliang, WANG Guoren
School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China
全文: PDF(1026 KB)  
输出: BibTeX | EndNote (RIS)      
摘要 顶点相似度计算在现实生活中具有广泛的应用。当前对相似性计算的研究工作主要集中于静态图上,并且大多相似性计算模型是基于SimRank算法提出的。而现实中的许多场景,需采用时序图进行建模。当前针对静态图的大量SimRank的计算方法无法在时序图中实现,因此该文对大规模时序图中的SimRank计算开展详细研究,并提出一种时序关联的SimRank计算方法(temporal-aware SimRank,TaSimRank)。TaSimRank根据图的拓扑结构和时间约束通过高效的迭代方法计算SimRank。同时,该文提出一种近似算法,通过随机游走方法建立树形索引,使用Monte Carlo方法近似计算顶点的相似度,取得时间和效率的平衡。最后,通过大量真实实验验证了提出算法的有效性和可扩展性。
服务
把本文推荐给朋友
加入引用管理器
E-mail Alert
RSS
作者相关文章
苗壮
袁野
乔百友
王一舒
马玉亮
王国仁
关键词 时序图相似度随机游走    
Abstract:Similarity calculations have many real life applications. The research on similarity calculations have mainly been focused on static graphs with many similarity calculation models based on SimRank. In real life, many systems, such as communication networks, are modeled by temporal graphs. However, the traditional SimRank algorithm cannot be implemented in temporal graphs. Therefore, this study analyzes the similarity calculation problem for large temporal graphs. A temporal-aware SimRank (TaSimRank) algorithm was developed to compute the node similarity through an efficient iterative method based on the topological structure and time constraints of the graph. An approximate algorithm is then used to implement the similarity calculations using a tree-based index built by a random walk and the Monte Carlo method. The algorithm balances the calculational time and efficiency. Tests on real temporal graphs demonstrate the effectiveness and extensibility of these approaches.
Key wordstemporal graph    similarity    random walk
收稿日期: 2018-07-21      出版日期: 2018-12-13
基金资助:国家重点研发计划资助项目(2016YFC1401900);国家自然科学基金资助项目(61572119,61622202)
通讯作者: 袁野,教授,E-mail:yuanye@ise.neu.edu.cn     E-mail: yuanye@ise.neu.edu.cn
引用本文:   
苗壮, 袁野, 乔百友, 王一舒, 马玉亮, 王国仁. 面向大规模时序图SimRank的计算方法[J]. 清华大学学报(自然科学版), 2018, 58(12): 1066-1071.
MIAO Zhuang, YUAN Ye, QIAO Baiyou, WANG Yishu, MA Yuliang, WANG Guoren. SimRank calculations for large temporal graphs. Journal of Tsinghua University(Science and Technology), 2018, 58(12): 1066-1071.
链接本文:  
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2018.26.050  或          http://jst.tsinghuajournals.com/CN/Y2018/V58/I12/1066
  图1 时序图
  图2 静态图
  图3 路径融合
  图4 根据路径融合建立索引
  图5 索引生成算法
  图6 基于随机游走的SimRank近似计算方法
  表1 实验数据集
  图7 算法准确度
  图8 查询时间
  图9 索引建立时间
  图10 索引大小
[1] SALTON G, MCGILL M J. Introduction to modern information retrieval[J]. McGraw-Hill, 1983, 55(4):239-240.
[2] HERLOCKER J L, KONSTAN J A, TERVEEN L G, et al. Evaluating collaborative filtering recommender systems[J]. ACM Transactions on Information Systems, 2004, 22(1):5-53.
[3] HOLME P, SARAMAKI J. Temporal networks[J]. Physics Reports, 2012, 519(3):97-125.
[4] WU H H, CHENG J, HUANG S L, et al. Path problems in temporal graphs[J]. Proceedings of the VLDB Endowment, 2014, 7(9):721-732.
[5] WU H, HUANG Y, CHENG J, et al. Reachability and time-based path queries in temporal graphs[C]//IEEE, International Conference on Data Engineering. Helsinki, Finland:IEEE, 2016:145-156.
[6] JEH G, WIDOM J. SimRank:A measure of structural-context similarity[C]//Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada:ACM, 2002:538-543.
[7] YU W R, LIN X M, ZHANG W J. Towards efficient SimRank computation on large networks[C]//Proceedings of 2013 IEEE International Conference on Data Engineering. Brisbane, Australia:IEEE, 2013:601-612.
[8] LIZORKIN D, VELIKHOV P, GRINEV M, et al. Accuracy estimate and optimization techniques for SimRank computation[J]. The VLDB Journal, 2010, 19(1):45-66.
[9] LI C P, HAN J W, HE G M, et al. Fast computation of SimRank for static and dynamic information networks[C]//Proceedings of the 13th International Conference on Extending Database Technology. Lausanne, Switzerland:ACM, 2010:465-476.
[10] YU W R, LIN X M, ZHANG W J. Fast incremental SimRank on link-evolving graphs[C]//Proceedings of IEEE International Conference on Data Engineering. Chicago, USA:IEEE, 2014:304-315.
[11] SHAO Y X, CUI B, CHEN L, et al. An efficient similarity search framework for SimRank over large dynamic graphs[J]. Proceedings of the VLDB Endowment, 2015, 8(8):838-849.
[12] LI R Q, ZHAO X, SHANG H C, et al. Fast top-k similarity join for SimRank[J]. Information Sciences, 2017, 381:1-19.
[13] JIANG M H, FU A W C, WONG R C W. READS:A random walk approach for efficient and accurate dynamic SimRank[J]. Proceedings of the VLDB Endowment, 2017, 10(9):937-948.
[14] FOGARAS D, RÁCZ B. Scaling link-based similarity search[C]//Proceedings of the 14th International Conference on World Wide Web. Chiba, Japan:ACM, 2005:641-650.
[15] JOHNSON R W. An introduction to the bootstrap[J]. Teaching Statistics, 2001, 23(2):49-54.
[1] 宋宇波, 祁欣妤, 黄强, 胡爱群, 杨俊杰. 基于二阶段多分类的物联网设备识别算法[J]. 清华大学学报(自然科学版), 2020, 60(5): 365-370.
[2] 王鑫, 徐祖华, 赵均, 邵之江. 基于梯度特征的分布曲线模型预测控制算法[J]. 清华大学学报(自然科学版), 2019, 59(5): 403-408.
[3] 王元龙, 李茹, 张虎, 王智强. 阅读理解中因果关系类选项的研究[J]. 清华大学学报(自然科学版), 2018, 58(3): 272-278.
[4] 贾楠, 刘呈, 钱静. 基于安全系统相似机理模型的事故分析[J]. 清华大学学报(自然科学版), 2018, 58(11): 1006-1012.
[5] 闫健恩, 张兆心, 许海燕, 张宏莉. 基于命令语法结构特征的IRC僵尸网络频道检测[J]. 清华大学学报(自然科学版), 2017, 57(9): 914-920.
[6] 刘洪玉, 李妍. 基于模糊数学的房地产批量评估[J]. 清华大学学报(自然科学版), 2017, 57(11): 1202-1206.
[7] 邓辉, 刘晖, 张宝峰, 毛军捷, 郭颖, 熊琦, 谢仕华. 面向复杂网络的威胁度量及聚合方法[J]. 清华大学学报(自然科学版), 2016, 56(5): 511-516.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
版权所有 © 《清华大学学报(自然科学版)》编辑部
本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn