Uyghur Organization Name Recognition Based on Conditional Random Fields
  • English title: Uyghur organization name recognition based on conditional random fields
  • Authors: Maihemuti Maimaiti; WANG Lu-lu; Tuergen Yibulayin; Aishan Wumaier; Kahaerjiang Abiderexiti
  • Affiliations: College of Information Science and Engineering, Xinjiang University; Xinjiang Laboratory of Multi-Language Information Technology, Xinjiang University
  • Keywords: named entity; organization name recognition; Uyghur language; conditional random field; agglutinative language
  • Journal code: SJSJ
  • Journal: Computer Engineering and Design
  • Publication date: 2019-01-16
  • Publisher: Computer Engineering and Design (计算机工程与设计)
  • Year: 2019
  • Issue: v.40; No.385
  • Funding: National Natural Science Foundation of China (61462083, 61262060, 61331011, 61463048); National Basic Research Program of China (973 Program) (2014CB340506); Young Doctoral Fund of the "Autonomous Region Young Science and Technology Innovation Talent Training Project" (QN2015BS004)
  • Language: Chinese
  • Pages: SJSJ201901045
  • Page count: 6
  • CN: 01
  • ISSN: 11-1775/TP
  • Classification number: 281-286
Abstract
Existing methods for Uyghur organization name recognition rely heavily on manually written rules and have low recognition efficiency. To alleviate these problems, a Uyghur organization name recognition method based on the conditional random field (CRF) model is proposed. In line with the linguistic characteristics of Uyghur, the method combines word, part-of-speech, and syllable features with a list of organization-name feature words and a list of location names. Experimental results show that, compared with a rule-based method and a hidden Markov model (HMM), the proposed method does not depend on manually written rules and achieves higher recognition accuracy and recall.
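The feature combination named in the abstract (word, part of speech, syllable, organization-name feature words, location names) can be illustrated with a minimal sketch of per-token feature extraction for a CRF tagger. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names, the two toy word lists, the part-of-speech tags, the placeholder Latin-script tokens, and the vowel-counting stand-in for a real Uyghur syllabifier. The paper's actual feature templates are not given in this record.

```python
from typing import Dict, List, Tuple, Union

Feature = Union[str, int, bool]

# Hypothetical toy word lists; the paper's real lists are not reproduced here.
ORG_FEATURE_WORDS = {"uniwersiteti", "shirkiti", "bankisi"}
LOCATION_NAMES = {"shinjang", "qeshqer"}

VOWELS = set("aeiouöüé")


def rough_syllable_count(word: str) -> int:
    """Crude syllable proxy: count vowel letters (an assumption, not a real syllabifier)."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def token_features(sent: List[Tuple[str, str]], i: int) -> Dict[str, Feature]:
    """Build a feature dict for the i-th (word, pos) pair of a sentence."""
    word, pos = sent[i]
    lower = word.lower()
    feats: Dict[str, Feature] = {
        "word": lower,
        "pos": pos,
        "suffix3": lower[-3:],  # suffixes carry much information in agglutinative languages
        "syllables": rough_syllable_count(word),
        "is_org_feature_word": lower in ORG_FEATURE_WORDS,
        "is_location": lower in LOCATION_NAMES,
    }
    # Context features from the previous and next tokens.
    if i > 0:
        prev_word, prev_pos = sent[i - 1]
        feats["prev_word"] = prev_word.lower()
        feats["prev_pos"] = prev_pos
        feats["prev_is_location"] = prev_word.lower() in LOCATION_NAMES
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(sent) - 1:
        next_word, next_pos = sent[i + 1]
        feats["next_word"] = next_word.lower()
        feats["next_pos"] = next_pos
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(sent: List[Tuple[str, str]]) -> List[Dict[str, Feature]]:
    """Feature dicts for every token in the sentence."""
    return [token_features(sent, i) for i in range(len(sent))]


# Placeholder tokens and tags, for illustration only.
example = [("Shinjang", "NP"), ("Uniwersiteti", "N"), ("qurulghan", "V")]
features = sentence_features(example)
```

A list of such dicts per sentence, paired with a label sequence (for example B-ORG, I-ORG, O), is a common input shape for CRF toolkits. A location name followed by an organization feature word, as in the placeholder sentence, is the kind of pattern these features let the model pick up without hand-written rules.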
