ILATalk: a new multilingual text-to-speech synthesizer with machine learning
  • Author: Saleh M. Abu-Soud
  • Keywords: Text-to-speech; Synthesizer; Multilingual; NLP; Phonemes; Inductive learning; ILA
  • Journal: International Journal of Speech Technology
  • Publication year: 2016
  • Publication date: March 2016
  • Volume: 19
  • Issue: 1
  • Pages: 55-64
  • Full-text size: 774 KB
  • Author affiliation: Saleh M. Abu-Soud (1)

    1. Department of Software Engineering, Princess Sumaya University for Technology, Amman, 11941, Jordan
  • Journal category: Engineering
  • Journal subjects: Signal, Image and Speech Processing
    Social Sciences
    Artificial Intelligence and Robotics
  • Publisher: Springer Netherlands
  • ISSN: 1572-8110
Abstract
This paper presents ILATalk, a new multilingual text-to-speech system based on inductive learning. The system has three phases: analysis, learning, and synthesis. It can accept any language; all that is needed is to store, in data tables used as input to the system, the phonemes of the language together with a data set of training examples generated from a representative, selected subset of its words. The system has been thoroughly tested in many sets of experiments with various parameters and sizes, and compared with two well-known approaches: ID3 and neural-network backpropagation. The results show that ILATalk produces correct phonemes with high accuracy and outperforms these algorithms in most cases.
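
The three-phase flow described in the abstract can be illustrated with a toy sketch. This is not the ILATalk implementation: the abstract does not give the training-example format or the details of the ILA rule learner, so the sliding letter window, the majority-vote "learner", the function names, and the phoneme symbols below are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

PAD = "_"  # hypothetical padding symbol for word boundaries

# Hypothetical toy lexicon: each word is aligned one letter to one phoneme symbol.
LEXICON = [
    ("cat", ["k", "a", "t"]),
    ("cot", ["k", "o", "t"]),
    ("tac", ["t", "a", "k"]),
]


def analyze(lexicon, window=1):
    """Analysis phase (sketch): build (letter-context, phoneme) training examples."""
    examples = []
    for word, phonemes in lexicon:
        if len(word) != len(phonemes):
            raise ValueError(f"unaligned entry: {word!r}")
        padded = PAD * window + word + PAD * window
        for i, phoneme in enumerate(phonemes):
            # Context = the letter at position i plus `window` letters on each side.
            examples.append((padded[i:i + 2 * window + 1], phoneme))
    return examples


def learn(examples):
    """Learning phase (stand-in): majority-vote lookup tables, NOT the ILA algorithm."""
    by_context = defaultdict(Counter)
    by_letter = defaultdict(Counter)
    for context, phoneme in examples:
        by_context[context][phoneme] += 1
        by_letter[context[len(context) // 2]][phoneme] += 1

    def pick(table):
        # Keep the most frequent phoneme for each key.
        return {key: counts.most_common(1)[0][0] for key, counts in table.items()}

    return pick(by_context), pick(by_letter)


def synthesize(word, rules, window=1):
    """Synthesis phase (sketch): map each letter of a new word to a phoneme symbol."""
    context_rules, letter_rules = rules
    padded = PAD * window + word + PAD * window
    phonemes = []
    for i, letter in enumerate(word):
        context = padded[i:i + 2 * window + 1]
        # Prefer the full-context rule, fall back to the single-letter rule, else "?".
        phonemes.append(context_rules.get(context, letter_rules.get(letter, "?")))
    return phonemes


if __name__ == "__main__":
    rules = learn(analyze(LEXICON))
    print(synthesize("act", rules))
```

In this sketch, supporting another language only means supplying a different lexicon, which mirrors the abstract's claim that the language-specific material lives entirely in the input data tables. A real system would replace `learn` with an actual rule-induction algorithm and would need a letter-to-phoneme alignment step, both of which are outside what the abstract specifies.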
