Learning to suggest questions in social media
  • Authors: Tom Chao Zhou (1)
    Michael Rung-Tsong Lyu (2) (3)
    Irwin King (2) (3)
    Jie Lou (4)

    1. Baidu Inc., Shenzhen, China
    2. Shenzhen Key Laboratory of Rich Media Big Data Analytics and Applications, Shenzhen Research Institute, The Chinese University of Hong Kong, Shenzhen, China
    3. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
    4. Department of Information Systems, City University of Hong Kong, Kowloon Tong, Hong Kong
  • Keywords: Social media; Online forum; Community-based Q&A; Question suggestion; Language model; Topic modeling
  • Journal: Knowledge and Information Systems
  • Publication year: 2015
  • Publication date: May 2015
  • Volume: 43
  • Issue: 2
  • Pages: 389-416
  • Full-text size: 1,547 KB
  • Journal category: Computer Science
  • Journal subjects: Information Systems and Communication Service
    Business Information Systems
  • Publisher: Springer London
  • ISSN: 0219-3116
Abstract
Social media systems with Q&A functionality have accumulated large archives of questions and answers. Two representative types are online forums and community-based Q&A services. To help users explore these archives effectively, it is essential to suggest interesting items to an active user. In this article, we address the problem of question suggestion, which aims to suggest questions that are semantically related to a queried question. Existing bag-of-words approaches cannot bridge the lexical chasm between semantically related questions. We therefore present a new framework and propose the topic-enhanced translation-based language model (TopicTRLM), which fuses lexical and latent semantic knowledge. This fusion enables TopicTRLM to find questions that are semantically related to a given question even when there is little word overlap. Moreover, to incorporate answer information into the model, we also propose the topic-enhanced translation-based language model with answer ensemble. Extensive experiments have been conducted on real-world datasets. The experimental results indicate that our approach is effective and outperforms other popular methods on several metrics.
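
The idea of fusing lexical and latent semantic evidence can be illustrated with a small sketch. The code below is not the TopicTRLM formulation from the article, which this record does not reproduce. It assumes a generic translation-smoothed unigram language model for the lexical side, a topic-mixture word probability for the latent side, and a simple linear interpolation between them. All names (`lexical_prob`, `topic_prob`, `fused_score`), the weights `beta`, `lam`, and `gamma`, and the toy translation and topic tables are hypothetical and hand-set for illustration.

```python
import math
from collections import Counter


def ml_prob(word, tokens):
    """Maximum-likelihood estimate of P(word | tokens)."""
    return Counter(tokens)[word] / len(tokens) if tokens else 0.0


def lexical_prob(word, doc, collection, trans, beta=0.5, lam=0.2):
    """Translation-smoothed unigram probability of `word` given `doc`.

    `trans[(w, t)]` is an assumed word-to-word translation probability
    T(w | t); `beta` and `lam` are assumed smoothing weights.
    """
    translated = sum(trans.get((word, t), 0.0) * ml_prob(t, doc) for t in set(doc))
    mixed = (1 - beta) * ml_prob(word, doc) + beta * translated
    return (1 - lam) * mixed + lam * ml_prob(word, collection)


def topic_prob(word, doc_topics, topic_words):
    """Topic-mixture probability: sum over topics z of P(z | doc) * P(word | z)."""
    return sum(p_z * topic_words[z].get(word, 0.0) for z, p_z in doc_topics.items())


def fused_score(query, doc, doc_topics, collection, trans, topic_words, gamma=0.7):
    """Log-likelihood of the query under a linear mix of both models (assumed fusion)."""
    score = 0.0
    for w in query:
        p = (gamma * lexical_prob(w, doc, collection, trans)
             + (1 - gamma) * topic_prob(w, doc_topics, topic_words))
        score += math.log(p + 1e-12)  # small constant guards against log(0)
    return score


# Toy data: neither candidate shares a word with the query.
query = ["cheap", "flight"]
travel_q = ["budget", "airline", "ticket"]
code_q = ["python", "list", "sort"]
collection = travel_q + code_q

trans = {("cheap", "budget"): 0.6, ("flight", "airline"): 0.5}
topic_words = {
    "travel": {"cheap": 0.1, "flight": 0.3},
    "code": {"python": 0.4, "sort": 0.2},
}

travel_score = fused_score(query, travel_q, {"travel": 0.9, "code": 0.1},
                           collection, trans, topic_words)
code_score = fused_score(query, code_q, {"travel": 0.1, "code": 0.9},
                         collection, trans, topic_words)
```

With these toy tables the travel-related candidate receives the higher score even though it has no word in common with the query, which is the behavior the abstract attributes to combining lexical and latent semantic knowledge. In practice the translation and topic tables would be estimated from a question archive rather than set by hand.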
