Graph-Based Approach for Cross Domain Text Linking
  • Keywords: Text graph; Cross domain text; Text linking; Semantic similarity
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2015
  • Volume: 9461
  • Issue: 1
  • Pages: 151-160
  • Full-text size: 810 KB
  • Author affiliations: Yu Hu (19)
    Tiezheng Nie (19)
    Derong Shen (19)
    Yue Kou (19)

    19. College of Information Science and Engineering, Northeastern University, Shenyang, China
  • Book series title: Web Technologies and Applications
  • ISBN: 978-3-319-28121-6
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
Comprehensive analysis of multi-domain texts has had a significant impact on text mining. Although the objects described by multi-domain texts belong to different fields, they sometimes partially overlap, and linking text fragments that overlap or complement one another is a necessary step for many tasks, such as entity resolution, information retrieval, and text clustering. Previous work on computing text similarity mainly focuses on string-based, corpus-based, and knowledge-based approaches. However, cross-domain texts exhibit distinctive features compared with texts from a single domain: (1) entity ambiguity: texts from different domains may contain various references to the same entity; (2) content skewness: cross-domain texts overlap only partially. In this paper, we propose a novel fine-grained approach based on a text graph for evaluating the semantic similarity of cross-domain texts in order to link their similar parts. The experimental results show that our approach is an effective solution for discovering the semantic relationships between cross-domain text fragments.
