Cross-lingual sentiment classification with stacked autoencoders
  • Authors: Guangyou Zhou; Zhiyuan Zhu; Tingting He…
  • Keywords: Sentiment classification; Cross-lingual; Stacked autoencoder
  • Journal: Knowledge and Information Systems
  • Publication year: 2016
  • Publication date: April 2016
  • Volume: 47
  • Issue: 1
  • Pages: 27-44
  • Full-text size: 957 KB
  • Affiliations: Guangyou Zhou (1)
    Zhiyuan Zhu (2)
    Tingting He (1)
    Xiaohua Tony Hu (1) (3)

    1. School of Computer, Central China Normal University, Wuhan, 430079, China
    2. Chinese Institute of Electronics, Beijing, 100036, China
    3. College of Computing and Informatics, Drexel University, Philadelphia, PA, 19104, USA
  • Journal category: Computer Science
  • Journal subjects: Information Systems and Communication Service
    Business Information Systems
  • Publisher: Springer London
  • ISSN: 0219-3116
Abstract
Cross-lingual sentiment classification is a popular research topic in natural language processing. The fundamental challenge of cross-lingual learning stems from a lack of overlap between the feature spaces of the source-language data and the target-language data. In this article, we propose a new model that uses stacked autoencoders to learn language-independent high-level feature representations for both languages in an unsupervised fashion. The proposed framework aims to force aligned bilingual input sentences into a common latent space; its objective function minimizes the difference between the input and output vector representations as well as the distance between the common representations in the latent space. With these language-independent high-level feature representations, sentiment classifiers trained on the source language can be adapted to predict the sentiment polarity of target-language text. We conduct extensive experiments on English–Chinese sentiment classification tasks over multiple data sets. Our experimental results demonstrate the efficacy of the proposed cross-lingual approach.
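
The objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the single hidden layer per language (rather than a full stack), the squared-error terms, the weight `lam`, the layer sizes, and all function names below are assumptions made purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_autoencoder(rng, n_in, n_hidden):
    """Small random weights for a one-hidden-layer autoencoder (hypothetical helper)."""
    return {
        "W_enc": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b_enc": np.zeros(n_hidden),
        "W_dec": rng.normal(0.0, 0.1, size=(n_hidden, n_in)),
        "b_dec": np.zeros(n_in),
    }

def encode(params, x):
    # Map input vectors to the latent space.
    return sigmoid(x @ params["W_enc"] + params["b_enc"])

def decode(params, h):
    # Map latent codes back to the input space.
    return sigmoid(h @ params["W_dec"] + params["b_dec"])

def bilingual_loss(src_params, tgt_params, x_src, x_tgt, lam=1.0):
    """Reconstruction error for each language plus a latent alignment penalty.

    Row i of x_src and row i of x_tgt are assumed to be an aligned sentence pair.
    """
    h_src = encode(src_params, x_src)
    h_tgt = encode(tgt_params, x_tgt)
    recon_src = np.mean(np.sum((decode(src_params, h_src) - x_src) ** 2, axis=1))
    recon_tgt = np.mean(np.sum((decode(tgt_params, h_tgt) - x_tgt) ** 2, axis=1))
    # Distance between the two latent codes of each aligned pair.
    align = np.mean(np.sum((h_src - h_tgt) ** 2, axis=1))
    return recon_src + recon_tgt + lam * align

# Toy data: 4 aligned sentence pairs with made-up feature sizes.
rng = np.random.default_rng(0)
x_en = rng.random((4, 6))   # stand-in for English sentence vectors
x_zh = rng.random((4, 5))   # stand-in for Chinese sentence vectors
en_ae = init_autoencoder(rng, n_in=6, n_hidden=3)
zh_ae = init_autoencoder(rng, n_in=5, n_hidden=3)
loss = bilingual_loss(en_ae, zh_ae, x_en, x_zh, lam=0.5)
```

Under these assumptions, minimizing `loss` with respect to both parameter sets pulls aligned sentences toward the same latent code, so a classifier trained on `encode(en_ae, ...)` features could in principle be applied to `encode(zh_ae, ...)` features.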
