News Summary Generation Based on the Seq2Seq Framework and a Domain Knowledge Graph
  • English title: A Seq2Seq framework and domain knowledge graph-based news summary generation technology
  • Authors: FU Yue (符悦); BAI Yu (白宇); CAI Dong-feng (蔡东风)
  • Affiliation: Research Center for Human-computer Intelligence, Shenyang Aerospace University (沈阳航空航天大学人机智能研究中心)
  • Keywords: domain knowledge graph; multi-document abstraction; news summary; seq2seq framework
  • Journal code: HKGX
  • Journal: Journal of Shenyang Aerospace University
  • Publication date: 2019-02-25
  • Publisher: Journal of Shenyang Aerospace University (沈阳航空航天大学学报)
  • Year: 2019
  • Issue: v.36; No.155
  • Funding: Natural Science Foundation of Liaoning Province (Grant No. 20170540696); Ministry of Education Humanities and Social Sciences Research Youth Fund (Grant No. 17YJCZH003)
  • Language: Chinese
  • Page: HKGX201901012
  • Number of pages: 11
  • CN: 01
  • ISSN: 21-1576/V
  • Classification code: 81-91
Abstract
News summaries help people grasp a large amount of news content in a short time and effectively mitigate information overload. Most existing work on news summary generation based on multi-document summarization scores sentences using only the pairwise relationships between them, then forms the summary by ranking and listing the sentences. This ignores the topic-level logical relationships between sentences in the text, so the resulting summaries lack readability and give users a poor reading experience. This paper proposes a news summary generation method based on a domain knowledge graph. The method uses the Seq2Seq framework to generate the topic sentences of the news, then organizes those topic sentences into a summary using the topic relevance of the nodes in the domain knowledge graph and the semantic associations between the nodes. Experimental results show that applying the Seq2Seq framework and the domain knowledge graph to news summary generation effectively improves the coherence, non-redundancy, and readability of the summaries.
