The Portuguese B\(^2\)SG: A Semantic Test for Distributional Thesaurus
  • Keywords: Gold standard; Semantic relations; Distributional thesauri
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9727
  • Issue: 1
  • Pages: 333-339
  • Full-text size: 140 KB
  • Authors and affiliation: Rodrigo Wilkens, Leonardo Zilio, Eduardo Ferreira, Aline Villavicencio (all at the Institute of Informatics, UFRGS, Porto Alegre, Brazil)
  • Series title: Computational Processing of the Portuguese Language
  • ISBN:978-3-319-41552-9
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics; Computer Communication Networks; Software Engineering; Data Encryption; Database Management; Computation by Abstract Devices; Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN:1611-3349
Abstract
The lack of gold standards for evaluating distributional thesauri is a stumbling block that prevents a direct, uniform comparison of alternative approaches. Here we present B\(^2\)SG, a TOEFL-like task for Portuguese that contains 2,875 tests with semantic relations (synonyms, antonyms and hypernyms) for nouns and verbs. The resource is validated by comparing it with lexical resources and by human judgment. It was then used to evaluate two distributional thesauri: one built from lemmata and the other from surface forms. The evaluation indicated that using lemmata is slightly more accurate than using surface forms for building distributional thesauri. B\(^2\)SG is available for download (http://www.inf.ufrgs.br/pln/resource/B2SG.zip).
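
As a rough illustration of how a TOEFL-like test can score a distributional thesaurus, the sketch below counts an item as correct when the thesaurus rates the related word above every distractor. The item layout (target, related word, list of distractors), the function names, and the toy similarity values are all assumptions made for this example; they do not reflect the actual file format of B\(^2\)SG or the scoring procedure used by the authors.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical item layout: (target, related_word, distractors).
# The real B2SG file format is not described in the abstract.
TestItem = Tuple[str, str, List[str]]


def accuracy(items: List[TestItem],
             similarity: Callable[[str, str], float]) -> float:
    """Fraction of items where the related word outscores every distractor."""
    if not items:
        return 0.0
    correct = 0
    for target, answer, distractors in items:
        answer_score = similarity(target, answer)
        if all(answer_score > similarity(target, d) for d in distractors):
            correct += 1
    return correct / len(items)


# Invented similarity values standing in for a distributional thesaurus.
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("carro", "automóvel"): 0.9,
    ("carro", "banana"): 0.1,
    ("carro", "livro"): 0.2,
    ("subir", "descer"): 0.3,
    ("subir", "comer"): 0.4,
    ("subir", "dormir"): 0.1,
}


def toy_similarity(a: str, b: str) -> float:
    """Look up a toy score; unknown pairs default to 0.0."""
    return TOY_SCORES.get((a, b), 0.0)


ITEMS: List[TestItem] = [
    ("carro", "automóvel", ["banana", "livro"]),  # synonym item
    ("subir", "descer", ["comer", "dormir"]),     # antonym item
]

print(accuracy(ITEMS, toy_similarity))  # prints 0.5
```

Running the same function with two similarity callables, one backed by a lemma-based thesaurus and one by a surface-form thesaurus, would give the kind of side-by-side accuracy comparison the abstract reports.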
