Research on Local Semantic Concept Representation Based Image Scene Classification Technology
(Original Chinese title: 基于局部语义概念表示的图像场景分类技术研究)

Author: Zhang Ruijie (张瑞杰)
Degree level: Doctoral
Discipline: Signal and Information Processing
Keywords: Image Scene Classification; Bag of Visual Words; Latent Semantic Indexing; Exact Euclidean Locality Sensitive Hashing; Multiple Kernel Learning; Non-Negative Sparse Locally Linear Coding; Fisher Discriminative Analysis; probabilistic Latent Semantic Analysis; Multi-Scale Information
Degree year: 2013
Advisor: Hu Guoen (胡国恩)
Discipline code: 081002
Degree-granting institution: PLA Information Engineering University
Submission date: 2013-04-15

Abstract

In recent years, with the rapid spread of computer, communication, and Internet technology, the volume of digital images has grown explosively. Faced with this mass of image data, how to make computers automatically "understand" images, and thereby classify and manage huge image collections quickly and effectively, has become an important and pressing problem in image research. Image scene classification automatically annotates an image database according to a given set of semantic categories. It supports semantic-based image classification and retrieval, and it also provides useful contextual semantic information for higher-level image understanding tasks such as object recognition. The core problem of image scene classification is how to bridge the "semantic gap" between low-level visual features and high-level semantic concepts. Extracting local invariant features from images and building a local semantic concept representation on them is one important line of research for addressing this problem. This thesis studies image scene classification based on local semantic concept representation. Starting from extracted local image features, and distinguishing approaches by how features are mapped to concepts, it investigates local semantic concept representations based on the visual vocabulary model, the sparse coding model, and the semantic topic model, and combines each with machine learning methods to perform scene classification. The novel contributions fall into the following five aspects:
     1. To address the synonymy and polysemy of visual words in the visual vocabulary model, this thesis proposes an image scene classification algorithm based on LSI and soft weighting. First, latent semantic indexing (LSI) is used to mine the latent semantic relationships among visual words and to reduce the dimensionality of the large visual vocabulary, yielding a more compact semantic vocabulary. Second, a soft-weighting scheme maps each local feature point to several neighboring visual words with different weights. The weighted occurrences of visual words in an image are then accumulated into a visual vocabulary distribution histogram, which serves as the image representation. Finally, a support vector machine (SVM) classifier performs scene classification. Experimental results show that the algorithm effectively overcomes visual word synonymy and polysemy and improves scene classification performance.
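The two building blocks named in contribution 1 can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the `1 / 2**i` weight decay, the use of a plain truncated SVD for LSI, and the random data are all assumptions made for the example, and the sketch applies LSI to finished histograms rather than to the vocabulary itself.

```python
import numpy as np

def soft_weighted_histogram(features, vocabulary, k=3):
    """Map each local feature to its k nearest visual words with decaying weights."""
    hist = np.zeros(len(vocabulary))
    for f in features:
        dists = np.linalg.norm(vocabulary - f, axis=1)
        nearest = np.argsort(dists)[:k]
        # Assumed weighting: the i-th nearest word (i = 0, 1, ...) receives 1 / 2**i.
        hist[nearest] += 0.5 ** np.arange(len(nearest))
    return hist / hist.sum()

def lsi_project(histograms, rank):
    """Project an (images x words) matrix onto its top `rank` latent dimensions."""
    _, _, vt = np.linalg.svd(histograms, full_matrices=False)
    return histograms @ vt[:rank].T

# Hypothetical data: 8 visual words and 5 images with 20 local descriptors each.
rng = np.random.default_rng(0)
vocabulary = rng.normal(size=(8, 4))
images = [rng.normal(size=(20, 4)) for _ in range(5)]

H = np.vstack([soft_weighted_histogram(x, vocabulary) for x in images])
Z = lsi_project(H, rank=3)  # compact representation that would be passed to an SVM
```

Soft weighting spreads each feature over several words, so two features that fall on either side of a cluster boundary still contribute to the same bins; the projection then merges words that tend to co-occur.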
     2. To address the intra-class diversity of scene images, this thesis proposes an image scene classification algorithm based on E2LSH-MKL. First, exact Euclidean locality sensitive hashing (E2LSH) is used for clustering to construct visual dictionaries and to produce an E2LSH-based visual vocabulary distribution histogram as the image representation. Second, E2LSH is combined with nonlinear multiple kernel learning (MKL) to build a nonlinear, non-stationary multiple kernel classifier, E2LSH-MKL. E2LSH-MKL uses the Hadamard product to combine different kernels nonlinearly, so that it can exploit the information arising from interactions between kernels. It also uses E2LSH-based clustering to group the images into subsets and assigns each subset its own kernel weights according to the relative contribution of each kernel on that subset, which gives a non-stationary weighting of the kernels and improves classifier performance. Finally, the E2LSH-based histogram representation and the E2LSH-MKL classifier are combined to perform scene classification. Experimental results show that the algorithm outperforms several existing multiple kernel learning methods and is effective in handling intra-class diversity.
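As a rough illustration of the two ingredients in contribution 2, the sketch below shows a p-stable hash of the form floor((a . x + b) / w), which is the standard E2LSH hash family, and an element-wise (Hadamard) product of two kernel matrices. It does not learn per-subset kernel weights as E2LSH-MKL does; the kernel choices, the parameter values, and all names are assumptions for the example.

```python
import numpy as np

def e2lsh_hash(x, a, b, w):
    """p-stable LSH for Euclidean distance: h(x) = floor((a . x + b) / w)."""
    return np.floor((x @ a + b) / w).astype(int)

def rbf_kernel(x, gamma):
    """Gaussian kernel matrix for the rows of x."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * x @ x.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def linear_kernel(x):
    """Inner-product kernel matrix for the rows of x."""
    return x @ x.T

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 5))           # hypothetical image representations

# Concatenate k hash values into one bucket key per image.
k, w = 4, 2.0                         # assumed number of hashes and bucket width
a = rng.normal(size=(5, k))           # Gaussian (2-stable) projection vectors
b = rng.uniform(0, w, size=k)         # random offsets in [0, w)
keys = [tuple(row) for row in e2lsh_hash(x, a, b, w)]

# Hadamard product of two kernels; the result is again positive semidefinite.
K = rbf_kernel(x, gamma=0.5) * (linear_kernel(x) + 1.0)
```

Images that share a bucket key are close in Euclidean distance with high probability, which is what makes the hash usable as a cheap clustering step. The product kernel stays a valid kernel because the element-wise product of positive semidefinite matrices is positive semidefinite.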
     3. To address the loss of spatial information and the weak discriminative power of sparse representation vectors in the sparse coding model, this thesis proposes an image scene classification algorithm based on Fisher discriminative sparse coding. First, a non-negative sparse locally linear coding of local feature points is constructed, reconstructing each feature point from its neighboring visual words in order to make effective use of the image's spatial information. Second, a Fisher discriminative criterion constraint is added to this coding, giving a non-negative sparse locally linear coding model with a Fisher discriminative constraint that yields discriminative sparse representations of images. Under this model, the sparse coefficients of images from the same class lie closer together and those of images from different classes lie farther apart, which improves the separability of the sparse coefficients and the classification capability of the sparse representation. Finally, an SVM classifier performs scene classification. Experimental results show that the algorithm exploits spatial information while seeking discriminative sparse representations, and that it outperforms several existing sparse coding methods, making it better suited to scene classification.
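The Fisher criterion in contribution 3 rests on two quantities: the within-class scatter of the codes, which should be small, and the between-class scatter, which should be large. The sketch below only computes these two traces for a toy set of codes. The exact objective and solver used in the thesis are not given in the abstract, so the function name and the data here are illustrative assumptions.

```python
import numpy as np

def fisher_scatter(codes, labels):
    """Return (within-class, between-class) scatter traces for row-vector codes."""
    overall_mean = codes.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        group = codes[labels == c]
        mean_c = group.mean(axis=0)
        # Spread of the class around its own mean.
        within += np.sum((group - mean_c) ** 2)
        # Distance of the class mean from the overall mean, weighted by class size.
        between += len(group) * np.sum((mean_c - overall_mean) ** 2)
    return within, between

# Hypothetical non-negative sparse codes for four images in two classes.
codes = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]])
labels = np.array([0, 0, 1, 1])
within, between = fisher_scatter(codes, labels)
```

A discriminative coding objective of this kind would add a term that decreases as `within` shrinks and `between` grows, alongside the usual reconstruction and sparsity terms.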
     4. Choosing an appropriate number of topics for the probabilistic latent semantic analysis (pLSA) model is important but difficult. Based on the result that the model is optimal when the average similarity among topics is minimal, this thesis proposes a density-based algorithm that adaptively and iteratively searches for the optimal number of pLSA topics. Experimental results show that, without manual intervention and in relatively few iterations, the algorithm automatically finds the optimal topic structure and matches the best performance of pLSA.
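The selection criterion in contribution 4 can be illustrated with a simplified stand-in: compute the average pairwise cosine similarity between topics for each candidate model and keep the model where it is lowest. This sketch scans a fixed set of already fitted models instead of running the density-based iteration described above, and the topic matrices are hand-made for illustration rather than fitted by pLSA.

```python
import numpy as np

def average_topic_similarity(topic_word):
    """Mean pairwise cosine similarity between topic-word distributions (rows)."""
    unit = topic_word / np.linalg.norm(topic_word, axis=1, keepdims=True)
    sim = unit @ unit.T
    upper = sim[np.triu_indices(len(topic_word), k=1)]  # each topic pair once
    return float(upper.mean())

def pick_topic_number(models):
    """`models` maps a candidate topic count to its P(word | topic) matrix."""
    return min(models, key=lambda k: average_topic_similarity(models[k]))

# Hypothetical fitted models: the 3-topic model has a redundant middle topic.
models = {
    2: np.array([[0.5, 0.5, 0.0, 0.0],
                 [0.0, 0.0, 0.5, 0.5]]),
    3: np.array([[0.5, 0.5, 0.0, 0.0],
                 [0.25, 0.25, 0.25, 0.25],
                 [0.0, 0.0, 0.5, 0.5]]),
}
best_k = pick_topic_number(models)
```

In this toy case the two-topic model wins because its topics share no words, while the extra topic in the three-topic model overlaps with both of the others.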
     5. To make effective use of multi-scale and contextual semantic information, this thesis proposes an image scene classification algorithm based on multi-scale contextual semantic information. First, each image is decomposed into multiple scales, and visual information of different granularity is extracted from the scale layers. Second, the density-based adaptive selection method is used to determine the optimal topic structure of the pLSA model. Third, the pLSA model is used to analyze the semantic co-occurrence probability of image patches and is combined with a Markov random field (MRF) to mine their contextual semantic co-occurrence information, giving more accurate visual words. Finally, visual word frequencies are counted on each scale layer and the per-scale features are combined by weighted concatenation into a multi-scale histogram, which is then classified with an SVM. Experimental results show that the algorithm effectively exploits multi-scale and contextual semantic information and improves scene classification performance.
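The final step of contribution 5, weighted concatenation of per-scale histograms, might look like the sketch below. The scale weights and the counts are made-up values, and the function name is hypothetical; the abstract does not say how the weights are chosen.

```python
import numpy as np

def multiscale_histogram(scale_histograms, scale_weights):
    """Normalize each scale's histogram, weight it, and concatenate the results."""
    parts = []
    for hist, weight in zip(scale_histograms, scale_weights):
        hist = np.asarray(hist, dtype=float)
        parts.append(weight * hist / hist.sum())
    return np.concatenate(parts)

# Hypothetical visual word counts at three scales, over a two-word vocabulary.
hists = [[4, 4], [1, 3], [2, 0]]
weights = [0.5, 0.25, 0.25]           # assumed weights, chosen to sum to 1
v = multiscale_histogram(hists, weights)
```

Normalizing before weighting keeps a scale with many patches from dominating the representation simply because it produces more counts.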
