Multi-label Feature Selection via Information Gain
  • Authors: Ling Li (22)
    Huawen Liu (22) (23)
    Zongjie Ma (22)
    Yuchang Mo (22)
    Zhengjie Duan (22)
    Jiaqing Zhou (22)
    Jianmin Zhao (22)
  • Keywords: Multi-label classification; High dimension; Feature selection; Information gain
  • Publication: Lecture Notes in Computer Science
  • Publication year: 2014
  • Volume: 8933
  • Issue: 1
  • Pages: 345-355
  • Full-text size: 205 KB
  • Author affiliations:

    22. Department of Computer Science, Zhejiang Normal University, China
    23. NCMIS, Academy of Mathematics and Systems Science, CAS, China
  • ISSN: 1611-3349
Abstract
Multi-label classification has attracted extensive attention recently. Unlike traditional classification, it allows one instance to be associated with multiple labels. The curse of dimensionality in multi-label data degrades the performance of multi-label classifiers. Multi-label feature selection is a powerful tool for high-dimensional problems. However, existing feature selection methods do not take both computational complexity and label correlation into consideration. To address this problem, this paper presents a new approach based on information gain for multi-label feature selection (IGMF). In IGMF, the information gain between a feature and the label set is used to measure the importance of the feature and the label correlations. The optimal feature subset is then obtained by setting a threshold value. A series of experimental results show that IGMF can improve the performance of multi-label classifiers.
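
The score-then-threshold idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' IGMF implementation: it assumes discrete features, scores each feature by summing its information gain over the individual labels (an assumed aggregation, since the abstract does not specify one), and keeps the features whose score meets a user-chosen threshold. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, label):
    """IG(feature; label) = H(feature) + H(label) - H(feature, label)."""
    joint = list(zip(feature, label))
    return entropy(feature) + entropy(label) - entropy(joint)


def score_features(X, Y):
    """Score each feature column of X by its summed information gain
    over all label columns of Y (assumed aggregation, not from the paper)."""
    label_columns = [[row[k] for row in Y] for k in range(len(Y[0]))]
    scores = []
    for j in range(len(X[0])):
        feature_column = [row[j] for row in X]
        scores.append(
            sum(information_gain(feature_column, lab) for lab in label_columns)
        )
    return scores


def select_features(X, Y, threshold):
    """Return the indices of features whose score is at least `threshold`."""
    return [j for j, s in enumerate(score_features(X, Y)) if s >= threshold]


# Toy data: 4 instances, 3 discrete features, 2 binary labels.
# Feature 0 mirrors label 0, feature 1 mirrors label 1, feature 2 is unrelated.
X = [[0, 1, 0], [0, 0, 1], [1, 1, 1], [1, 0, 0]]
Y = [[0, 1], [0, 0], [1, 1], [1, 0]]

if __name__ == "__main__":
    print(score_features(X, Y))
    print(select_features(X, Y, threshold=0.5))
```

Summing per-label gains keeps the cost linear in the number of labels but ignores dependencies among labels; treating the whole label set as one joint variable would capture them at the price of sparser counts. Continuous features would need to be discretized before applying this sketch.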
