Research on Perception-Oriented Image Retrieval and Automatic Image Annotation Algorithms (面向感知的图像检索及自动标注算法研究)

English title: Research on Perception Oriented Image Retrieval and Automatic Image Annotation
Author: 冯松鹤 (Feng Songhe)
Degree level: Doctoral
Discipline: Computer Application Technology
Keywords: image retrieval; saliency analysis; relevance feedback; manifold-ranking; automatic image annotation; semi-supervised learning; multi-instance multi-label learning; typicality analysis
Degree year: 2009
Advisor: 须德 (Xu De)
Discipline code: 081203
Degree-granting institution: Beijing Jiaotong University (北京交通大学)
Thesis submission date: 2009-01-01
Chair of the defense committee: 戴国忠 (Dai Guozhong)

Abstract

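With the development of multimedia and computer network technology, the volume of image data that people encounter is growing at an unprecedented rate. Faced with these massive image collections, content-based image retrieval systems, which analyze, organize, and manage image data effectively, have become a research focus in multimedia technology. The central difficulty is enabling computers to understand image semantics from the perspective of human cognition, so as to bridge as far as possible the semantic gap between low-level image features and high-level semantics. The first half of the thesis studies image retrieval algorithms oriented toward user perception, focusing on how to extract semantic content that matches user perception from images and how to incorporate the user's high-level semantics effectively to improve retrieval performance. The second half studies automatic image annotation algorithms, focusing on how to build effective machine learning models to describe and solve the annotation problem and how to improve the effectiveness of training samples so as to improve annotation performance.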
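     For region-based image retrieval, accurately extracting the part of the image that interests the user is the key to solving the retrieval problem. To address the ambiguity that only some regions of an image match the user's retrieval intent, a fully data-driven image retrieval algorithm based on the selective visual attention mechanism is proposed. The algorithm first generates a saliency map with an attention model, then combines it with the edge map and the segmentation map to select the salient edges and salient regions of the image automatically, and uses these to represent the user's retrieval intent. It then describes the salient information with effective visual features, and finally achieves semantics-based image retrieval through a feature fusion strategy.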
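     For image retrieval that combines machine learning theory with relevance feedback, the thesis concentrates on applying graph-based semi-supervised learning to region-based image retrieval. When there is no user feedback, or the user marks only positive images, retrieval is cast as a transductive learning problem: a hierarchical graph model that incorporates region saliency information is built, and the manifold-ranking algorithm performs label propagation. When the user marks both positive and negative images, these images are used to build a region-level similarity adjacency matrix, a graph learning algorithm iterates over it to select the set of region-level feature vectors that match the user's query semantics, and this set is then used for region-level image retrieval.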
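     For automatic image annotation, an analysis of the ambiguity present in both the input space and the output space of the annotation task leads to a multi-instance multi-label learning algorithm within a semi-supervised learning framework. The algorithm first proposes an improved method for computing diverse density, which measures the semantic similarity between a given keyword and each instance in the positive training bags and the unlabeled bags. Instances whose diverse density values satisfy a preset condition are then selected as instance prototypes representing the keyword and are modeled semantically with a Gaussian mixture model. On this basis, an effective feature mapping strategy is proposed to redefine the sets of positive bags and unlabeled bags. Finally, a graph-based semi-supervised learning algorithm propagates the label of the given keyword.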
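     For improving annotation performance, most existing annotation algorithms do not consider how typical a training sample is of a keyword. A method based on kernel density estimation is therefore proposed for computing a confidence weight, a real value in [0, 1] that reflects how typical a training sample is of the keyword. On this basis, an improved Citation-kNN multi-instance learning algorithm determines the class labels of the image to be annotated. The algorithm does not need to solve for the target instance of each keyword; instead, following the lazy learning idea, it decides the class of the image to be annotated directly at the bag level, which completes the annotation task.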
With the development of multimedia technology and computer networks, content-based image retrieval (CBIR) systems have become increasingly important for organizing, indexing, and retrieving massive image collections in many application domains, and CBIR has emerged as a hot research topic in recent years. The main difficulty of CBIR lies in making computers understand the semantic information of images from a human perceptual viewpoint, and in narrowing the well-known semantic gap between low-level visual features and high-level semantic concepts. The first part of this dissertation focuses on human-perception-oriented image retrieval algorithms, in particular on how to extract semantic information from an image and how to integrate the user's high-level semantics effectively to improve retrieval performance. The second part focuses on automatic image annotation, in particular on how to establish an effective machine learning model for the annotation problem and how to improve the effectiveness of training samples in order to refine annotation performance.
     For region-based image retrieval, the author argues that in most cases the user is interested in only a portion of the image and the rest is irrelevant. To resolve this ambiguity, a fully data-driven image retrieval algorithm based on a selective visual attention model is proposed. First, a saliency map is generated by the attention model, and salient edges and salient regions are extracted automatically by fusing the edge map and the segmented image with the corresponding saliency map; these are taken to represent the user's retrieval intention. Effective feature descriptors are then proposed and fused for the final semantic image retrieval.
     For image retrieval that combines machine learning theory with the relevance feedback mechanism, the dissertation focuses on graph-based semi-supervised learning applied to region-based image retrieval. Two schemes, both of which incorporate region saliency into the graph-based semi-supervised learning framework, handle two types of feedback. First, when no samples or only positive samples are available from the user's feedback, the retrieval task is solved in a transductive learning manner: a hierarchical graph model that incorporates region saliency information is constructed, and the manifold-ranking algorithm is then adopted for positive label propagation. Second, when the user provides both positive and negative samples, a region-level adjacency matrix is constructed from the feedback samples, and the manifold-ranking algorithm is again adopted to choose the instances that truly represent the user's query semantics. The selected instances are then used to retrieve the relevant samples.
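     The label propagation step named above can be illustrated with a minimal sketch of plain manifold ranking on a fully connected graph. This is the generic iteration f <- alpha * S * f + (1 - alpha) * y with a symmetrically normalized Gaussian affinity matrix S, not the hierarchical, saliency-weighted graph of the dissertation, whose construction is not detailed in this abstract. The function name `manifold_rank` and the parameters `sigma`, `alpha`, and `iters` are illustrative assumptions.

```python
import math


def manifold_rank(points, query_indices, sigma=1.0, alpha=0.9, iters=200):
    """Rank points by relevance to the query points (generic manifold ranking).

    Assumption: a fully connected graph with Gaussian affinities; the
    dissertation's hierarchical saliency-aware graph is not reproduced here.
    """
    n = len(points)
    # Gaussian affinity matrix with a zero diagonal (no self-loops).
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    deg = [sum(row) for row in w]
    # Symmetric normalization S = D^(-1/2) W D^(-1/2); isolated nodes get zeros.
    s = [
        [w[i][j] / math.sqrt(deg[i] * deg[j]) if deg[i] * deg[j] > 0 else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Initial label vector: 1 for query (positive) points, 0 otherwise.
    y = [1.0 if i in query_indices else 0.0 for i in range(n)]
    f = y[:]
    for _ in range(iters):
        f = [
            alpha * sum(s[i][j] * f[j] for j in range(n)) + (1.0 - alpha) * y[i]
            for i in range(n)
        ]
    return f


if __name__ == "__main__":
    # Two clusters; the query is point 0 in the first cluster.
    pts = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0)]
    scores = manifold_rank(pts, {0})
    print(sorted(range(len(pts)), key=lambda i: -scores[i]))
```

     Points that lie in the same cluster as the query receive much higher scores than points in the distant cluster, which is the behavior that makes the method suitable for propagating positive labels.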
     For automatic image annotation, observing that the annotation problem is ambiguous in both the input space and the output space, the dissertation presents a novel semi-supervised multi-instance multi-label (SSMIML) learning framework, which aims to take full advantage of both labeled and unlabeled data. Specifically, a reinforced diverse density algorithm is first applied to select the instance prototypes (IPs) for a given keyword from both positive and unlabeled bags. The selected IPs are then modeled with a Gaussian mixture model (GMM) to reflect the density distribution of the semantic class. Furthermore, based on the class distribution for a keyword, both positive and unlabeled bags are redefined using a novel feature mapping strategy. Each bag can thus be represented by one fixed-length feature vector, so that the manifold-ranking algorithm can subsequently propagate the corresponding label directly from positive bags to unlabeled bags.
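     As background for the prototype selection step, the sketch below computes the classical noisy-or diverse density of Maron and Lozano-Perez for a candidate point, given positive and negative bags. It is only a baseline illustration: the reinforced variant of the dissertation, which draws on unlabeled bags, is not specified in this abstract and is not reproduced. The names `diverse_density`, `best_prototype`, and the `scale` parameter are assumptions made for the example.

```python
import math


def _instance_prob(instance, target, scale=1.0):
    """Probability that an instance matches the target concept (Gaussian-like)."""
    d2 = sum((a - b) ** 2 for a, b in zip(instance, target))
    return math.exp(-scale * d2)


def diverse_density(target, positive_bags, negative_bags, scale=1.0):
    """Classical noisy-or diverse density of a candidate target point."""
    dd = 1.0
    for bag in positive_bags:
        # A positive bag supports the target if at least one instance is close.
        miss = 1.0
        for inst in bag:
            miss *= 1.0 - _instance_prob(inst, target, scale)
        dd *= 1.0 - miss
    for bag in negative_bags:
        # Every instance of a negative bag should be far from the target.
        for inst in bag:
            dd *= 1.0 - _instance_prob(inst, target, scale)
    return dd


def best_prototype(positive_bags, negative_bags, scale=1.0):
    """Pick the positive-bag instance with the highest diverse density."""
    candidates = [inst for bag in positive_bags for inst in bag]
    return max(
        candidates,
        key=lambda t: diverse_density(t, positive_bags, negative_bags, scale),
    )


if __name__ == "__main__":
    pos = [
        [(1.0, 1.0), (5.0, 0.0)],
        [(1.1, 0.9), (0.0, 6.0)],
        [(0.9, 1.0), (7.0, 7.0)],
    ]
    neg = [[(5.0, 0.1), (0.0, 6.1)]]
    print(best_prototype(pos, neg))
```

     In this toy data every positive bag contains an instance near (1, 1), while the other instances are either isolated or close to a negative instance, so the selected prototype falls in the shared region.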
     For image annotation refinement, most existing algorithms rarely take into account that the samples relevant to a given keyword generally differ in their typicality, or relevance score, with respect to that keyword. Inspired by kernel density estimation, the dissertation proposes a confidence weight computation algorithm, which uses a real number in [0, 1] to represent a sample's relevance score for a given keyword. Moreover, an improved Citation-kNN multiple-instance learning algorithm is proposed to solve the annotation problem. In contrast to existing annotation algorithms, which try to learn an explicit correspondence between keywords and target concepts, the proposed method annotates unlabeled images with keywords directly, in a lazy learning style.
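     One way to picture the confidence weight is the following sketch, which scores each training sample of a keyword by a Gaussian kernel density estimate over all samples of that keyword and rescales the scores by the largest one, so that the most typical sample gets weight 1. The rescaling by the maximum, the Gaussian kernel, and the names `typicality_weights` and `bandwidth` are assumptions for illustration; the abstract does not state the exact formula used in the dissertation.

```python
import math


def typicality_weights(samples, bandwidth=1.0):
    """Return one weight in (0, 1] per sample; denser regions score higher.

    Assumption: weights are kernel density estimates divided by their maximum.
    The kernel's normalizing constant is omitted because it cancels out.
    """
    if not samples:
        return []
    n = len(samples)
    densities = []
    for x in samples:
        total = 0.0
        for z in samples:
            d2 = sum((a - b) ** 2 for a, b in zip(x, z))
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        densities.append(total / n)
    peak = max(densities)
    return [d / peak for d in densities]


if __name__ == "__main__":
    # Three samples close together and one outlier for the same keyword.
    feats = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (3.0, 3.0)]
    for feat, weight in zip(feats, typicality_weights(feats)):
        print(feat, round(weight, 3))
```

     The outlier receives the smallest weight, which matches the intuition that an atypical sample should contribute less when a keyword is assigned to a new image.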
