Graph-Based Multimodal Music Mood Classification in Discriminative Latent Space

Keywords: Music mood classification; Multimodal; Graph learning; Locality Preserving Projection; Bag of sentences
Journal title: Lecture Notes in Computer Science
Publication year: 2017
Volume: 10132
Issue: 1
Pages: 152-163
Book series title: MultiMedia Modeling
ISBN: 978-3-319-51811-4

Abstract

Automatic music mood classification is an important and challenging problem in the field of music information retrieval (MIR) and has attracted growing attention from various research areas. In this paper, we propose a novel multimodal method for music mood classification that exploits the complementarity of the lyrics and audio information of music to enhance classification accuracy. We first extract descriptive sentence-level lyrics and audio features from the music. Then, we project the paired low-level features of the two modalities into a learned common discriminative latent space, which not only eliminates between-modality heterogeneity but also increases the discriminability of the resulting descriptions. On the basis of this latent representation of music, we employ a graph-learning-based multimodal classification model for music mood, which takes the cross-modality similarity between local audio and lyrics descriptions of music into account in order to exploit correlations between the modalities effectively. The mood category predictions obtained for every sentence of a piece of music are then aggregated by a simple voting scheme. Experiments on a real dataset composed of more than 3,000 minutes of music and corresponding lyrics demonstrate the effectiveness of the proposed method.
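
The final aggregation step can be illustrated with a minimal sketch. The abstract only says that sentence-level predictions are combined by "a simple voting scheme", so plain majority voting is assumed here; the function name `aggregate_by_voting`, the example mood labels, and the tie-breaking rule (the label seen first wins) are hypothetical and not taken from the paper.

```python
from collections import Counter


def aggregate_by_voting(sentence_predictions):
    """Return the most frequent mood label among per-sentence predictions.

    Assumption: plain majority voting. Ties go to the label that
    appears first in the input, which is a choice made for this sketch.
    """
    if not sentence_predictions:
        raise ValueError("need at least one sentence-level prediction")
    counts = Counter(sentence_predictions)
    # most_common keeps first-seen order among labels with equal counts
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical per-sentence predictions for one song
song_mood = aggregate_by_voting(["happy", "sad", "happy", "calm"])
print(song_mood)  # prints "happy"
```

A weighted variant, for example one that weights each sentence by classifier confidence, would fit the same interface, but nothing in the abstract indicates that the authors use one.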
