用户名: 密码: 验证码:
基于图勾勒的图链路预测方法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Graph sketches-based link prediction over graph data
  • 作者:尤洁 ; 李劲 ; 张赛 ; 李婷
  • 英文作者:YOU Jie;LI Jin;ZHANG Sai;LI Ting;School of Software, Yunnan University;Key Laboratory in Software Engineering of Yunnan Province;
  • 关键词:图数据 ; 算法复杂度 ; 链路预测 ; 图勾勒 ; 节点相似性 ; 并行计算 ; Apache ; Spark
  • 英文关键词:graph data;;algorithm complexity;;link-prediction;;graph sketches;;nodes similarity;;parallel computing;;Apache Spark
  • 中文刊名:ZNXT
  • 英文刊名:CAAI Transactions on Intelligent Systems
  • 机构:云南大学软件学院;云南省软件工程重点实验室;
  • 出版日期:2019-01-10 11:02
  • 出版单位:智能系统学报
  • 年:2019
  • 期:v.14;No.78
  • 基金:国家自然科学基金项目(61562091);; 云南省应用基础研究计划面上项目(2016FB110)
  • 语种:中文;
  • 页:ZNXT201904022
  • 页数:8
  • CN:04
  • ISSN:23-1538/TP
  • 分类号:161-168
摘要
针对已有链路预测算法复杂度高,不适于在大规模图上进行链接预测的问题,本文基于图勾勒近似技术对已有链路预测方法进行优化,提出了基于图勾勒的链路预测方法。该方法将链路预测算法的计算复杂度由在真实图数据集上进行测试,实验结果表明本文方法在保证链接预测精度的前提下,可有效提升算法效率。
        The high computational complexity of existing link prediction algorithms makes them unsuitable for link prediction on large-scale graphs.To solve this problem,we propose a novel link prediction approach that involves combining the existing link prediction approaches with graph sketch approximation.Our proposed approach reduces the computation complexity of link prediction from O(n~3)to O(n~2 k~2 log~2 n) Furthermore,to enhance the efficiency of our approach;we also provide a parallel link prediction algorithm,which is implemented on the parallel computing framework Apache Spark.Finally,we conducted extensive experiments on a real network dataset to test the validation and efficiency of our approach.The experimental results indicate that our methods can effectively improve the efficiency of link prediction while guaranteeing prediction accuracy as well.
引文
[1]吕琳媛.复杂网络链路预测[J].电子科技大学学报,2010,39(5):651-661.LYU Linyuan.Link prediction on complex networks[J].Journal of University of Electronic Science and Technology of China,2010,39(5):65 1-661.
    [2]CLAUSET A,MOORE C,NEWMAN M E J.Hierarchical structure and the prediction of missing links in networks[J].Nature,2008,453(7191):98-101.
    [3]AIROLDI E M,BLEI D M,FIENBERG S E,et al.Mixed membership stochastic blockmodels[J].Journal of machine learning research,2008,9:1981-2014.
    [4]YU Kai,CHU Wei,YU Shipeng,et al.Stochastic relational models for discriminative link prediction[C]//Proceedings of the 19th International Conference on Neural Information Processing Systems.Vancouver,Canada,2006:1553-1560.
    [5]LIBEN-NOWELL D,KLEINBERG J.The link-prediction problem for social networks[J].Journal of the association for information science and technology,2007,5 8(7):1019-1031.
    [6]ADAMIC L A,ADAR E.Friends and neighbors on the Web[J].Social networks,2003,25(3):21 1-230.
    [7]LYU Linyuan,JIN Cihang,ZHOU Tao.Similarity index based on local paths for link prediction of complex networks[J].Physical review E,2009,80(4):046122.
    [8]ZHOU Tao,Lu Linyuan,ZHANG Yicheng.Predicting missing links via local information[J].The European physical journal B,2009,71(4):623-630.
    [9]KATZ L.A new status index derived from sociometric analysis[J].Psychometrika,1953,18(1):39-43.
    [10]LEICHT E A,HOLME P,NEWMAN M E J.Vertex similarity in networks[J].Physical review E,2006,73:026120.
    [11]KLEIN D J,RANDIC M.Resistance distance[J].Journal of mathematical chemistry,1993,12(1):81-95.
    [12]FOUSS F,PIROTTE A,RENDERS J M,et al.Randomwalk computation of similarities between nodes of a graph with application to collaborative recommendation[J].IEEE transactions on knowledge and data engineering,2007,19(3):355-369.
    [13]BRIN S,PAGE L.The anatomy of a large-scale hypertextual Web search engine[J].Computer networks and ISDN systems,1998,30(1-7):107-117.
    [14]JEH G,WIDOM J.SimRank:a measure of structuralcontext similarity[C]//Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Edmonton,Alberta,Canada,2002:538-543.
    [15]饶君,吴斌,东昱晓.MapReduce环境下的并行复杂网络链路预测[J].软件学报,2012,23(12):3175-3186.RAO Jun,WU Bin,DONG Yuxiao.Parallel link prediction in complex network using mapreduce[J].Journal of software,2012,23(12):3175-3186.
    [16]COHEN E.All-distances sketches[M]//KAO M Y.Encyclopedia of Algorithms.New York:Springer,2016:2320-2334.
    [17]BATAGELJ V,MRVAR A.Pajek datasets[EB/OL].2006http://vlado.fmf.uni-lj.si/pub/networks/data/default.html.
    [18]BU Dongbo,ZHAO Yi,CAI Lun,et al.Topological structure analysis of the protein-protein interaction network in budding yeast[J].Nucleic acids research,2003,31(9):2443-2450.
    [19]WATTS D J,STROGATZ S H.Collective dynamics of'small-world'networks[J].Nature,1998,393(6684):440-442.
    [20]PEROZZI B,AL-RFOU R,SKIENA S.DeepWalk:online learning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York,USA,2014:701-710
    [21]BREITKREUTZ B J,STARK C,REGULY T,et al.The bioGRID interaction database:2008 update[J].Nucleic acids research,2008,36(S1):D637-D640.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700