Abstract
Word segmentation is an important application area in Python, and many tools implement it, such as jieba, SnowNLP, THULAC, and NLPIR. A word cloud is designed and built on top of word segmentation: it highlights the key points of a whole text, reveals its key concepts, and can be presented to readers in fun, efficient, and novel ways through different display forms. Taking Chinese word segmentation as an example, this paper describes in detail the design and optimization of a word cloud using the jieba and wordcloud libraries.
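As background to the approach summarized above, the following is a minimal sketch of how jieba and wordcloud are typically combined: segment the Chinese text, join the tokens with spaces, and render the result. The input file name, font path, and output file name here are assumptions for illustration, not the paper's actual configuration.

import jieba
from wordcloud import WordCloud

# Read the source text (hypothetical file name).
with open("report.txt", encoding="utf-8") as f:
    text = f.read()

# Segment the Chinese text and join the words with spaces,
# because WordCloud splits its input on whitespace by default.
words = " ".join(jieba.lcut(text))

# Build the word cloud. A font that supports Chinese must be supplied
# (the font path below is an assumption), otherwise the characters
# render as empty boxes.
wc = WordCloud(font_path="msyh.ttc", width=800, height=600,
               background_color="white").generate(words)

# Save the rendered image to disk.
wc.to_file("wordcloud.png")

Optimization of the result (stop-word filtering, custom shapes, color maps) is layered on top of this basic pipeline.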