GB-JER: A Graph-Based Model for Joint Entity Resolution
  • Authors: Chenchen Sun (17)
    Derong Shen (17)
    Yue Kou (17)
    Tiezheng Nie (17)
    Ge Yu (17)

    17. College of Information Science and Engineering, Northeastern University, Shenyang, China
  • Keywords: Joint entity resolution ; Similarity propagation ; Structure-based similarity ; Entity representation relationship graph
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2015
  • Volume: 9049
  • Issue: 1
  • Pages: 458-473
  • Full-text size: 437 KB
  • Book title: Database Systems for Advanced Applications
  • ISBN: 978-3-319-18119-6
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN:1611-3349
Abstract
Resolving multiple classes of related entity representations jointly improves the accuracy of entity resolution. We propose GB-JER, a graph-based joint entity resolution model that exploits a dynamic entity representation relationship graph. GB-JER contracts the neighborhood of each matched pair, and the resulting enrichment of semantics provides new evidence for subsequent entity resolution iterations. GB-JER is also an incremental approach. The experimental evaluation shows that GB-JER outperforms the existing state-of-the-art joint entity resolution approach in accuracy.
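
The contraction idea in the abstract can be illustrated with a small sketch. This is not the GB-JER algorithm itself, whose details the abstract does not give: the `RelationshipGraph` class, the `contract` and `neighbor_similarity` methods, and the use of Jaccard overlap as a structure-based similarity are all assumptions made for illustration. The sketch only shows how merging one matched pair (two paper representations) can create new evidence for a related pair (two author representations).

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class RelationshipGraph:
    """Hypothetical undirected graph over entity representations."""

    def __init__(self, edges):
        self.adj = defaultdict(set)
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)

    def neighbor_similarity(self, u, v):
        """Assumed structure-based similarity: overlap of the two neighborhoods."""
        return jaccard(self.adj[u] - {v}, self.adj[v] - {u})

    def contract(self, keep, drop):
        """Merge matched node `drop` into `keep`, rewiring drop's edges to keep."""
        for n in self.adj.pop(drop):
            self.adj[n].discard(drop)
            if n != keep:
                self.adj[n].add(keep)
                self.adj[keep].add(n)
        self.adj[keep].discard(drop)


# Toy data: a1 and a2 are two representations of one author,
# p1 and p2 are two representations of one paper.
g = RelationshipGraph([("a1", "p1"), ("a2", "p2")])

before = g.neighbor_similarity("a1", "a2")  # no shared neighbors yet
g.contract("p1", "p2")                      # p1 and p2 were matched
after = g.neighbor_similarity("a1", "a2")   # both authors now link to p1
print(before, after)
```

In this toy graph the two author representations share no neighbors until the two paper representations are merged; after contraction both connect to the same paper node, so their neighborhood overlap rises. A joint resolver could then revisit the author pair with this new evidence in a later iteration.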
