Research on 3D Human Pose Estimation from Monocular Video for Human-Computer Interaction (面向人机交互的单目视频三维人体姿态估计研究)

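English title: Research on Human Pose Estimation with Monocular Videos for HCI Applications
Author: Li Na (李娜)
Degree level: Doctoral
Discipline: Computer Science and Technology
Degree year: 2008
Advisor: Chen Chun (陈纯)
Discipline code: 081203
Degree-granting institution: Zhejiang University (浙江大学)
Thesis submission date: 2008-07-01
Chair of the defense committee: Dong Jinxiang (董金祥)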

Abstract

Automatically understanding human motion in images and video sequences has long been a focus of computer vision research. Beyond the human interest in exploring and imitating ourselves through machines, a major driver is the rapid development of electronic devices, including personal computers and consumer electronics, and the large application market that comes with them. This thesis studies 3D human pose estimation from monocular video for human-computer interaction (HCI) applications.
     Monocular 3D human pose estimation is one of the most challenging problems in computer vision. The system's observations are complex natural images, its output state is a high-dimensional human pose, and inference from observation to state is a dynamic, nonlinear process. Moreover, for HCI applications the core algorithms must be accurate, robust, and real-time at once, and system initialization should be as automatic as possible, with minimal user involvement. To address these requirements, we study each module of a monocular 3D human pose estimation system in turn and integrate the resulting algorithms into an HCI prototype system, thereby realizing human-computer interaction based on monocular 3D human pose estimation.
     We divide the work into three key technologies: image feature extraction, human pose estimation, and automatic initialization. The feature extraction work targets common low-end cameras such as webcams; we propose a feature extraction algorithm based on the HSV color space, which is consistent with human visual perception, to improve the effectiveness and robustness of the extracted features. For pose estimation, we propose a hybrid model for 3D human pose that combines a discriminative model with a generative model: the discriminative model first locates a local subspace containing the target pose, and the generative model then refines the pose within that subspace, so that the approach draws on the strengths of both. For initialization, we focus on a framework and evaluation metrics for user-assisted (semi-automatic) video object segmentation. An efficient segmentation tool makes it easier for users to provide training data and thus reduces their manual work during initialization.
     Building on these algorithms, we developed a real-time HCI system controlled by body movement. To evaluate it, we implemented a simple game played with an ordinary webcam, which also serves as an experimental platform for exploring interaction design based on human motion. Using this platform, we carried out extensive user testing and discuss the problems and opportunities that this new form of interaction faces. The results demonstrate the effectiveness of the proposed monocular 3D human pose estimation system and show the appeal and broad application prospects of interaction systems based on human movement.
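
The two-stage idea described in the abstract (a discriminative step that narrows the search to a local region of pose space, followed by a generative step that refines the pose there) can be illustrated with a small sketch. Everything concrete below is an assumption made for illustration and is not taken from the thesis: the "body" is a planar two-link arm, the discriminative model is a plain nearest-neighbour lookup, the generative refinement is a simple random search, and all function names are hypothetical.

```python
import math
import random


def render(pose):
    """Toy generative model (assumption): joint angles (a, b) of a planar
    two-link arm -> features (elbow x, elbow y, hand x, hand y)."""
    a, b = pose
    ex, ey = math.cos(a), math.sin(a)
    return (ex, ey, ex + math.cos(a + b), ey + math.sin(a + b))


def feature_error(pose, obs):
    """Distance between the rendered features of a pose and the observation."""
    return math.dist(render(pose), obs)


def discriminative_stage(obs, train_set, k=5):
    """Nearest-neighbour lookup (a stand-in for a learned discriminative
    model): return the k training poses whose features are closest to obs."""
    ranked = sorted(train_set, key=lambda item: math.dist(item[1], obs))
    return [pose for pose, _ in ranked[:k]]


def generative_stage(obs, candidates, rng, n_samples=200, n_iters=40):
    """Local random search around the candidates: sample poses, render them,
    keep the one that best explains obs, and halve the radius on a stall."""
    dims = range(2)
    best = tuple(sum(c[d] for c in candidates) / len(candidates) for d in dims)
    best_err = feature_error(best, obs)
    # Initial search radius per dimension: candidate range plus a small floor.
    radius = [max(c[d] for c in candidates) - min(c[d] for c in candidates) + 0.02
              for d in dims]
    for _ in range(n_iters):
        centre, improved = best, False
        for _ in range(n_samples):
            sample = tuple(centre[d] + rng.gauss(0.0, radius[d]) for d in dims)
            err = feature_error(sample, obs)
            if err < best_err:
                best, best_err, improved = sample, err, True
        if not improved:
            radius = [r * 0.5 for r in radius]
    return best, best_err


def estimate_pose(obs, train_set, rng):
    """Discriminative stage proposes a local region; generative stage refines."""
    candidates = discriminative_stage(obs, train_set)
    return generative_stage(obs, candidates, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical training data: random poses paired with rendered features.
    poses = [(rng.uniform(-1.0, 1.0), rng.uniform(0.0, 2.0)) for _ in range(500)]
    train_set = [(p, render(p)) for p in poses]
    true_pose = (0.3, 1.1)
    est, err = estimate_pose(render(true_pose), train_set, rng)
    print("estimated pose:", est, "feature error:", err)
```

In this sketch the lookup keeps the refinement cheap by restricting it to a small neighbourhood, while the render-and-compare step recovers precision that a coarse training set alone would not give. A real system would replace `render` with an image-based likelihood over a full-body model.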

引文

[1]Sony Eyetoy.http://www.eyetoy.com.
    [2]Nintendo Wii.http://www.wii.com.
    [3]Phoenix Technologies Inc.http://www.ptiphoenix.com/.
    [4]Motion Analysis Corporation.http://www.motionanalysis.com/html/movement/eagle4.html.
    [5]Vicon Motion Systems.http://www.vicon.com/.
    [6]Ign-Eyetoy:Antigrav http://bestof.ign.com/2005/ps2/20.html.
    [7]Cmu Graphics Lab Motion Capture Database http://mocap.cs.cmu.edu.
    [8]Half-Life Sierra Entertainment.http://www.sierra.com/landing/home.html.
    [9]Eyetoy:Kinetic.http://rr.ps2.ign.com/rrview/ns2/eyetoy_kinetic/715377/37490/.
    [10]Wii Reviews.http://reviews.cnet.com/consoles/nintendo-wii/4505-10109_7-31355104.html.
    [11]A.Agarwal,B.Triggs.3d Human Pose from Silhouettes by Relevance Vector Regression.in IEEE Conference on Computer Vision and Pattern Recognition.2004.Washington D.C.,USA.
    [12]A.Agarwal,B.Triggs.Recovering 3d Human Pose from Monocular Images.IEEE Transactions on Pattern Analysis and Machine Intelligence,2006.28(1):44-58.
    [13]A.AgarwaI,B.Triggs.Recovering.3d Human Pose from Monocular Images.IEEE Transactions on Pattern Analysis and Machine Intelligence,2006.28(1):44-58
    [14]J.K.Aggarwal,Q.Cai.Human Motion Analysis:A Review Computer Vision and Image Understanding 1999 73(3):428-440
    [15]N.Amenta D.Attali,O.Devillers.Complexity of Delaunay Triangulation for Points on Lower-Dimensional Polyhedra.in ACM-SIAM Symposium on Discrete Algorithms.2007.
    [16]D.Anguelov,E Srinivasan,D.Koller,S.Thrun,J.Rodgers.Scape:Shape Completion and Animation of People.in ACM SIGGraph.2005.Los Angeles,CA,USA
    [17]V.Athitsos,S.Sclaroff.Estimating 3d Hand Pose from a Cluttered Image.in IEEE Conference.on Computer Vision and Pattern Recognition.2003.Madison,WI,USA.
    [18]A.Azarbayejani,C.Wren,A.Pentland.Real-Time 3-D Tracking of the Human Body.in IMAGE'COM.1996.Bordeaux,France.
    [19]A.O.Balan,M.J.Black,H.W.Haussecker,L.Sigal.Shining a Light on Human Pose:On Shadows,Shading and the Estimation of Pose and Shape in IEEE International Conference on Computer Vision.2007.Rio de Janeiro,Brazil.
    [20]A.O.Balan,L.Sigal,M.J.Black,J.E.Davis,H.W.Haussecker.Detailed Human Shape and Pose from Images.in IEEE Conference on Computer Vision and Pattern Recognition.2007.RMinneaapolis,MN,USA.
    [21]C.B.Barber,D.P.Dobkin,H.Huhdanpaa.The Quickhull Algorithm for Convex Hulls.ACM Transactions on Mathematical Software,1996.22(4):469-483
    [22]F.Barrientos.Continuous Control of Avatar Gesture.in ACM International Multimedia Conference.2000.
    [23]H.Bay,T.Tuytelaars,L.J.V.Gool.Surf:Speeded up Robust Features.in European Conference on Computer Vision 2006.Graz,Austria.
    [24]S.Belongie,J.Malik,J.Puzicha.Shape Matching and Object Recognition Using Shape Contexts.IEEE Transactions on Pattern Analysis and Machine Intelligence 2002.24(4):509-522.
    [25]I.Biederman.Recognition-by-Components:A Theory of Human Image Understanding.Psychological Review,1987.94(2):115-147.
    [26]C.M.Bishop,J.Lasserre.Generative or Discriminative? Getting the Best of Both Worlds.in ISBA Eighth World Meeting on Bayesian Statistics 2006.Valencia,Spain
    [27]C.M.Bishop.Pattern Recognition and Machine Learning.Springer.2007.
    [28]A.Bissacco,M.-H.Yang,S.Soatto.Fast Human Pose Estimation Using Appearance and Motion Via Multi-Dimensional Boosting Regression.in IEEE Conference on Computer Vision and Pattern Recognition.2007.Minneapolis,MN,USA.
    [29]A.F.Bobick.Movement,Activity,and Action:The Role of Knowledge in the Perception of Motion.in Royal Society Workshop on Knowledgebased Vision in Man and Machine.1997.London,England.
    [30]A.F.Bobick,J.W.Davis.The Recognition of Human Movement Using Temporal Templates.IEEE Transactions on Pattern Analysis and Machine Intelligence,2001.23(3):257-267.
    [31]O.Boiman,M.Irani.Detecting Irregularities in Images and in Video.in IEEE International Conference on Computer Vision 2005.Beijing,China.
    [32]S.Boltz,E.Debreuve,M.Barlaud.High-Dimensional Statistical Distance for Region-of-Interest Tracking:Application to Combininga Soft Geometric Constraint with Radiometry.in IEEE Conference on Computer Vision and Pattern Recognition.2007.Minneapolis,Minnesota,USA.
    [33]R.Boulic,P.Bécheiraz,L.Emering,D.Thalmann.Integration of Motion Control Techniques for Virtual Human and Avatar Real-Time Animation.in ACM Virtual Reality Software and Technology Conference 1997.
    [34]M.Brand.Shadow Puppetry.in IEEE International Conference on Computer Vision.1999.Corfu,Greece.
    [35]R.Brunelli,T.Poggio.Face Recognition:Features Versus Templates.IEEE Transactions on Pattern Analysis and Machine Intelligence,1993.15(10):1042-1052.
    [36]J.Carranza,C.Theobalt,M.A.Magnor,H.-P.Seidel.Free-Viewpoint Video of Human Actors.in ACM SIGGraph.2003.San Diego,CA,USA.
    [37]A.Cavallaro,E.D.Gelasca,T.Ebrahimi.Objective Evaluation of Segmentation Quality Using Spatio-Temporal Context.in IEEE International Conference on Image Processing.2002.
    [38]J.Chai,J.K.Hodgins.Performance Animation from Low-Dimensional Control Signals.in ACM SIGGraph.2005.Angeles,CA,USA
    [39]O.T.-C.Chen,C.-C.Chen.Automatically-Determined Region of Interest in Jpeg 2000.IEEE Transactions on Multimedia,2007.9(7):1333-1345.
    [40]E.Chiavaccini,G.Vitetta.Map Symbol Estimation on Frequency-Flat Rayleigh Fading Channels Via a Bayesian Em Algorithm.IEEE Transactions on Communications,2001.49(11):1057-1061.
    [41]C.Christensen,S.Corneliussen,Visualization of Human Motion Using Model-Based Vision,in Technical report,Laboratory of Image Analysis.1997,Aalborg University.
    [42]D.Comaniciu,P.Meer.Mean Shift:A Robust Approach toward Feature Space Analysis IEEE Transaction on Pattern Analysis and Machine Intelligence,2002.24(5):603-619.
    [43]P.Correia,F.Pereira.Objective Evaluation of Relative Segmentation Quality.in IEEE International Conference on Image Processing 2000.
    [44]N.Dalal,B.Triggs.Histograms of Oriented Gradients for Human Detection.in IEEE Conference on Computer Vision and Pattern Recognition 2005.San Diego,CA,USA.
    [45]J.W.Davis,A.F.Bobick.The Representation and Recognition of Action Using Temporal Templates.in IEEE Conference on Computer Vision and Pattern Recognition.1997.San Juan,Puerto Rico.
    [46]Q.Delamarre,O.Fangeras.3d Articulated Models and Multi-View Tracking with Physical Forces.Computer Vision and Image Understanding,2001.81(3):328-357
    [47]D.Demirdjian,T.Ko,T.Darrell.Untethered Gesture Acquisition and Recognition for Virtual World Virtual Reality 2005.8(4):222-230
    [48]A.Dempster,N.Laird,D.Rubin.Maximum Likelihood from Incomplete Data Via the Em Algorithm.Journal of the Royal Statistical Society.Series B(Methodological),1977.39(1):1-38.
    [49]J.Deutscher,A.Blake,I.Reid.Articulated Body Motion Capture by Annealed Particle Filtering.in IEEE Conference on Computer Vision and Pattern Recognition.2000.Hilton Head Island,SC,USA.
    [50]J.J.DiCarlo,D.D.Cox.Untangling Invariant Object Recognition.Trends in Cognitive Sciences,2007.11(8):333-340.
    [51]P.Dollár,V.Rabaud,G.Cottrell,S.Belongie.Behavior Recognition Via Sparse Spatio-Temporal Features.in IEEE Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.2005.Beijing,.China.
    [52]Z.Duric,W.D.Gray,R.Heishman,F.Li,A.Rosenfeld,M.J.Schoelles,C.Schunn,H.Wechsler.Integrating Perceptual and Cognitive Modeling for Adaptive and Intelligent Huma-Computer Interaction.Proceedings of the IEEE,2002.90(7):1272-1289.
    [53]A.A.Efros,A.C.Berg,G.Mori,J.Malik.Recognizing Action at a Distance.in IEEE International Conference on Computer Vision.2003.Nice,France.
    [54]J.Eisenstein,R.Davis.Natural Gesture in Descriptive Monologues.in ACM Symposium on User Interface Software and Techology 2003.New York,NY,USA.
    [55]J.Eisenstein,W.E.Mackay.Interacting with Communication Appliances:An Evaluation of Two Computer Vision Based Selection Techniques.in ACM Conference on Human Factors in Computing Systems(CHI).2006.Montréal,Québec,Canada.
    [56]A.Elgammal,C.-S.Lee.Inferring 3d Body Pose from Silhouettes Using Activity Manifold Learning.in IEEE Conference on Computer Vision and Pattern Recognition.2004.Washington D.C.,USA.
    [57]C.E.Erdem,A.M.Tekalp,B.Sankur.Metrics for Performance Evaluation of Video Object Segmentation in IEEE International Conference on Image Processing.2001.Greece.
    [58]C.E.Erdem,B.Sankur.Performance Evaluation Metrics for Object-Based Video Segmentation.in European Signal Processing Conference 2000.
    [59]M.Everingham,H.L.Muller,B.T.Thomas.Evaluating Image Segmentation Algorithms Using the Pareto Front.in European Conference on Computer Vision 2002.
    [60]M.W.Eysenck,M.T.Keane.Cognitive Psychology:A Student's Handbook(5th Edition):A Student's Handbook.Psychology Press Ltd 2005.
    [61]A.m.Farahmand,C.Szepesvari,J.-Y.Audibert.Manifold-Adaptive Dimension Estimation in International conference on Machine Learning.2007.Oregon,USA.
    [62]P.F.Felzenszwalb,D.P.Huttenlocher.Efficient Matching of Pictorial Structures.in IEEE Conference on Computer Vision and Pattern Recognition.2000.Hilton Head Island,SC,USA.
    [63]P.F.Felzenszwalb,D.P.Huttenlocher.Pictorial Structures for Object Recognition. International Journal of Computer Vision,2005.61(1):55-79.
    [64]J.Flusser,T.Suk.Pattern Recognition by Affine Moment Invariants.Pattern Recognition,1993.26(1):167-174.
    [65]W.Forstner,E.Gulch.A Fast Operator for Detection and Precise Location of Distinct Points,Corners and Centers of Circular Features.in ISPRS.1987.
    [66]D.A.Forsyth,O.Arikan,L.Ikemoto,J.O'Brien,D.Ramanan.Computational Studies of Human Motion:Part 1,Tracking and Motion Synthesis.Foundations and Trends in Computer Graphics and Vision,2005.1(2-3):77-254.
    [67]A.R.J.Francois,G.G.Medioni.Adaptive Color Background Modeling for Real-Time Segmentation of Video Streams.in International Conference on Imaging Science,Systems,and Technology.1999.Las Vegas,NV,USA.
    [68]J.Francik,A.Szarowicz.Integrate and Conquer - the Next Generation of Intelligent Avatars.in ACM SIGCHI International Conference on Advances in Computer Entertainment Technology.2005.Valencia,Spain.
    [69]W.T.Freeman,E.H.Adelson.The Design and Use of Steerable Filters.IEEE Transactions on Pattern Analysis and Machine Intelligence,1991.13(9):891-906.
    [70]D.Gatica-Perez,M.-T.Sun,C.Gu.Multiview Extensive Partition Operators for Semantic Video Object Extraction IEEE Transactions on Circuits and Systems for Video Technology,2001.11(7):788-801.
    [71]D.Gatica-Perez,C.Gu,M.-T.Sun.Semantic Video Object Extraction Using Four-Band Watershed Andpartition Lattice Operators.IEEE Transactions on Circuits and Systems for Video Technology,2001 11(5):603-618.
    [72]D.Gavrila,V.Philomin.Real-Time Object Detection for Smart Vehicles.in IEEE International Conference on Computer Vision.1999.Kerkyra,Corfu,Greece.
    [73]D.M.Gavrila.The Visual Analysis of Human Movement:A Survey.Computer Vision and Image Understanding,1999.73(1):82-98.
    [74]T.Gevers,A.W.M.Smeulders,H.Stokman.Photometric Invariant Region Detection.in British Machine Vision Conference.1998.Southampton,UK.
    [75]I.Giakoumis,I.Pitas.Digital Restoration of Painting Cracks.in IEEE International Symposium on Circuits and Systems.1998.Monterey,California,USA.
    [76]L.J.V.Gool,T.Moons,D.Ungnreanu.Affine/Photometric Invariants for Planar Intensity Patterns.in European Conference on Computer Vision.1996.Cambridge,UK.
    [77]F.S.Grassia.Practical Parameterization of Rotations Using the Exponential Map.Journal of Graphics Tools,1998.3(3):29-48
    [78]C.Gu,M.-C.Lee.Tracking of Multiple Semantic Video Objects for Internet Applications.in SPIE Visual Communications and Image Processing.1999.
    [79]L.Gu,T.Kanade.3d Alignment of Face in a Single Image in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [80]P.Hamalainen,J.Hoysniemi,T.Ilmonen,M.Lindholm,A.Nykanen.Martial Arts in Artificial Reality in ACM Conference on Conference on Human Factors in Computing Systems 2005.Portland,Oregon,USA.
    [81]Hoysniemi,P.Hamalainen,L.Turkki,T.Rouvi.Children's Intuitive Gestures in Vision-Based Action Games.Communications of the ACM,2005.48(1):44-50.
    [82]I.Haritaoglu,D.Harwood,L.S.Davis.W4:Who? When? Where? What? A Real Time System for Detecting and Tracking People IEEE Transaction on Pattern Analysis and Machine Intelligence,2000.22(8):809-830.
    [83]C.Harris,M.Stephens.A Combined Corner and Edge Detector.in Alvey Vision Conference.1988.
    [84]B.Hewett,Card,Carey,Gasen,Mantei,Perlman,Strong and Verplank The Content of Human-Computer Interaction http://www.sigchi.org/cdg/cdg2.html#2_3.
    [85]T.Horprasert,D.Harwood,L.S.Davis.A Statistical Approach for Real-Time Robustbackground Subtraction and Shadow Detection.in IEEE International Conference on Computer Vision,Frame Rate Workshop 1999.Kerkyra,Greece.
    [86]N.R.Howe,M.E.Leventon,W.T.Freeman.Bayesian Reconstruction of 3d Human Motion from Single-Camera Video.in Advances in Neural Information Processing Systems.2000.
    [87]N.R.Howe.Silhouette Lookup for Monocular 3d Pose Tracking.Image and Vision Computing,2007.25(3).
    [88]M.-K.Hu.Visual Pattern Recognition by Moment Invariants.IEEE Transactions on Information Theory,1962.8(2):179-187.
    [89]E.Huber.3-D Real-Time Gesture Recognition Using Proximity Spaces.in IEEE International Conference on Pattern Recognition.1996.Vienna,Austria.
    [90]S.Ioffe,D.Forsyth.Human Tracking with Mixtures of Trees.in IEEE International Conference on Computer Vision.2001.Vancouver,Canada.
    [91]S.Ioffe,D.A.Forsyth.Probabilistic Methods for Finding People.International Journal of Computer Vision,2001.43(1):45-68.
    [92]L.Itti,C.Koch,E.Niebur.A Model of Saliency-Based Visual Attention for Rapid Scene Analysis.IEEE Transactions on Pattern Analysis and Machine Intelligence,1998.20(11):1254-1259.
    [93]S.Iwasawa,J.Ohya,K.Takahashi,T.Sakaguchi,S.Kawato,K.Ebihara,S.Morishima.Real-Time,3d Estimation of Human Body Postures from Trinocular Images.in IEEE International Workshop on Modelling People 1999.
    [94]S.Jayaram,S.Schmugge,M.C.Shin,L.V.Tsap.Effect of Colorspace Transformation,the Illuminance Component,and Color Modeling on Skin Detection.in IEEE Conference on Computer Vision and Pattern Recognition 2004.Washington,DC,USA.
    [95]H.Jhuang,T.Serre,L.Wolf,T.Poggio.A Biologically Inspired System for Action Recognition.in IEEE International Conference on Computer Vision 2007.
    [96]K.Jiang,Q.Liao,Y.Xiong.A Novel White Blood Cell Segmentation Scheme Based on Feature Space Clustering.Soft Computing,2006.10(1):12-19.
    [97]G.Johansson.Visual Perception of Biological Motion and a Model for Its Analysis.Perception and Psychophysics,1973.14(2):201-211.
    [98]G.Johansson.Visual Motion Perception.Scientific American,1975.232(6):76-89.
    [99]S.Ju,M.J.Black,Y.Yacoob.Cardboard People:A Parameterized Model of Articulated Image Motion in IEEE International Conference on Automatic Face and Gesture Recognition.1996.Killington,Vermont,USA.
    [100]I.Kakadiaris,D.Metaxas.Vision-Based Animation of Digital Humans.in Computer Animation Conference 1998.
    [101]A.Kanaujia,C.Sminchisescu,D.N.Metaxas.Semi-Supervised Hierarchical Models for 3d Human Pose Reconstruction.in IEEE Conference on Computer Vision and Pattern Recognition. 2007.Minneapolis,MN,USA.
    [102]Y.Ke,R.Sukthankar,M.Hebert.Efficient Visual Event Detection Using Volumetric Features.in IEEE International Conference on Computer Vision.2005.Beijing,China.
    [103]B.M.Kelm,C.Pal,A.McCallum.Combining Generative and Discriminative Methods for Pixel Classification with Multi-Conditional Learning.in International Conference on Pattern Recognition.2006.Hong Kong,China.
    [104]J.Kleindienst,T.Macek,L.Serédi,J.Sedivy.Interaction Framework for Home Environment Using Speech and Vision.Image and Vision Computing.2007.25(12):1836-1847
    [105]J.J.Koenderink,A.J.v.Doorn.Representation of Local Geometry in the Visual System.Biological Cybernetics,1987.55:367-375.
    [106]M.Kolabdouzan,C.Shahabi.Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases.in International Conference on Very Large Data Bases 2004.Toronto,Canada
    [107]L.Kovacs,T.Sziranyi.Focus Area Extraction by Blind Deconvolution for Defining Regions of Interest.IEEE Transactions on Pattern Analysis and Machine Intelligence,2007.29(6):1080-1085.
    [108]I.Laptev,T.Lindeberg.Space-Time Interest Points.in IEEE International Conference on Computer Vision.2003.Nice,France.
    [109]N.D.Lawrence,A.J.Moore.Hierarchical Gaussian Process Latent Variable Models in International conference on Machine Learning.2007.Oregon,USA.
    [110]M.W.Lee,R.Nevatia.Human Pose Tracking Using Multi-Level Structured Models.in European Conference on Computer Vision.2006.Graz,Austria.
    [111]S.-Y.Lee,I.-J.Kim,S.C.Ahn,M.-T.Lim,H.-G.Kim.Intelligent 3d Video Avatar for Immersive Telecommunication.Lecture Notes in Computer Science,2005.3809:726-735.
    [112]R.Li,M.-H.Yang,S.Sclaroff,T.-P.Tian.Monocular Tracking of 3d Human Motion with a Coordinated Mixture of Factor Analyzers in European Conference on Computer Vision.2006.Graz,Austria.
    [113]J.M.Linebarger,G.D.Kessler,The Effect of Avatar Connectedness on Task Performance in Lehigh University Technique Report.2002.
    [114]D.G.Lowe.Object Recognition from Local Scale-Invariant Features.in IEEE International Conference on Computer Vision.1999.Kerkyra,Corfu,Greece.
    [115]G.Loy,M.Eriksson,J.Sullivan,S.Carlsson.Monocular 3d Reconstruction of Human Motion in Long Action Sequences.in European Conference on Computer Vision.2004.Prague,Czech Republic.
    [116]H.Luo,A.Eleftheriadis.Designing an Interactive Tool for Video Object Segmentation and Annotation.in ACM Multimedia 1999.
    [117]T.Meier,K.N.Ngan.Automatic Segmentation of Moving Objects for Video Object Plane Generat IEEE Transactions on Circuits and Systems for Video Technology,1998.8(5):525-538.
    [118]D.Meyer,J.Denzler,H.Niemann.Model Based Extraction of Articulated Objects in Image Sequences for Gait Analysis.in IEEE International Conference on Image Processing.1997.
    [119]E.Meyers,L.Wolf.Using Biologically Inspired Features for Face Processing International Journal of Computer Vision,2008.76(1):93-104.
    [120]A.S.Micilotta,E.-J.Ong,R.Bowden.Detection and Tracking of Humans by Probabilistic Body Part Assembly.in British Machine Vision Conference.2005.Oxford,UK.
    [121]K.Mikolajczyk,C.Schmid.A Performance Evaluation of Local Descriptors.IEEE Transactions on Pattern Analysis and Machine Intelligence,2004.27(10):1615-1630.
    [122]K.Mikolajczyk,C.Schmid,A.Zisserman.Human Detection Based on a Probabilistic Assembly of Robust Part Detectors.in European Conference on Computer Vision.2004.Prague,Czech Republic
    [123]F.Mindru,T.Moons,L.J.V.Gool.Recognizing Color Patterns Irrespective of Viewpoint and Illumination.in IEEE Conference on Computer Vision and Pattern Recognition.1999.Fort Collins,CO,USA.
    [124]F.Mindru,T.Tuytelaars,L.J.V.Gool,T.Moons.Moment Invariants for Recognition under Changing Viewpoint and Illumination.Computer Vision and Image Understanding 2004.94(1-3):3-27.
    [125]J.Mitchelson,A.Hilton.Hierarchical Tracking of Multiple People.in British Machine Vision Conference.2003.Norwich,UK.
    [126]T.B.Moeslund,E.Granum.A Survey of Computer Vision-Based Human Motion Capture.Computer Vision and Image Understanding 2001.81(3):231-268
    [127]T.B.Moeslund,A.Hilton,V.Kruger.A Survey of Advances in Vision-Based Human Motion Capture and Analysis.Computer Vision and Image Un.derstanding,2006.104(2):90-126.
    [128]B.Moghaddam,A.Pentland.Beyond Euclidean Eigenspaces:Bayesian Matching for Visual Recognition Face Recognition.:From Theories to Applications,1998.
    [129]K.Moon,V.Pavlovic.Impact.of Dynamics on Subspace Embedding and Tracking of Sequences.in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [130]G.Mori,X.Ren,A.A.Efros,J.Malik.Recovering Human Body Configurations:Combining Segmentation and Recognition.in IEEE Conference on Computer Vision and Pattern Recognition.2004.Washington D.C.,USA.
    [131]G.Mori,J.Malik.Recovering 3d Human Body Configurations Using Shape Contexts.IEEE Transactions on Pattern Analysis and Machine Intelligence,2006.28(7):1052-1062.
    [132]E.N.Mortensen,W.A.Barrett.Interactive Segmentation with Intelligent Scissors.Graphical Models and Image Processing,1998.60(5):349-384.
    [133]R.Navaratnam,A.Thayananthan,P.H.S.Torr.Hierarchical Part-Based Human Body Pose Estimation.in British Machine Vision Conference.2005.Oxford,UK.
    [134]J.C.Niebles,H.Wang,F.-F.Li.Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words.in British Machine Vision Conference 2006.Edinburgh.
    [135]J.Nilsson,F.Sha,M.I.Jordan.Regression on Manifolds Using Kernel Dimension Reduction in International conference on Machine Learning.2007.Oregon,USA.
    [136]S.A.Niyogi,E.H.Adelson.Analyzing and Recognizing Walking Figures in Xyt.in IEEE Conference on Computer Vision and Pattern Recognition 1994.Seattle,WA,USA.
    [137]E.-J.Ong,S.Gong.Tracking Hybrid 2d-3d Human Models from Multiple Views.in IEEE International Workshop on Modelling People.1999.Corfu,Greece.
    [138]E.-J.Ong,A.Hilton,A.S.Micilotta.Viewpoint Invariant Exemplar-Based 3d Human Tracking.in IEEE Workshop on Modeling People and Human Interaction.2005.Beijing,China.
    [139]E.-J.Ong,A.S.Micilotta,R.Bowden,A.Hilton.Viewpoint Invariant Exemplar-Based 3d Human Tracking.Computer Vision and Image Understanding,2006.104(2):178-189
    [140]M.Pantic,A.Pentland,A.Nijholt,T.Huang.Human Computing and Machine Understanding of Human Behavior:A Survey.in IEEE International Conference on Multimodal Interfaces.2006.Banff,Alberta,CA.
    [141]C.Papageorgiou,T.Poggio.A Trainable System for Object Detection.International Journal of Computer Vision,2000.38(1):15-33
    [142]J.Perales,J.Torres.A System for Human Motion Matching between Synthetic and Real Image Based on a Biomechanic Graphical Model.in IEEE Workshop on Motion of Non-Rigid and Articulated Objects.1994.Austin.
    [143]R.W.Picard.Affective Computing.MITPress Cambridge,MA.1998.
    [144]R.Plankers,P.Fua.Articulated Soft Objects for Multiview Shape and Motion Capture Plankers.IEEE Transactions on Pattern Analysis and Machine Intelligence 2003.25(9):1182-1187.
    [145]I.Poddar,Y.Sethi,E.Ozyildiz,R.Sharma.Toward Natural Gesture/Speech Hci:A Case Study of Weather Narration.in Workshop on Perceptual User Interfaces.1998.
    [146]R.Polana,R.Nelson.Recognizing Activities.in IEEE International Conference on Pattern Recognition.1994.
    [147]R.Polana,R.Nelson.Low Level Recognition of Human Motion.in IEEE Workshop on Non-rigid and Articulated Motion.1994.Austin,Texas,USA.
    [148]C.M.Privitera,L.W.Stark.Algorithms for Defining Visual Regions-of-Interest:Comparison Witheye Fixations.IEEE Transactions on Pattern Analysis and Machine Intelligence,2000.22(9):970-982.
    [149]R.Raina,Y.Shen,A.Y.Ng,A.McCallum.Classification with Hybrid Generative/Discriminative Models.in Advances in Neural Information Processing Systems.2003.
    [150]D.Ramanan,D.A.Forsyth,A.Zisserman.Strike a Pose:Tracking People by Finding Stylized Poses.in IEEE Conference on Computer Vision and Pattern Recognition.2005
    [151]D.Ramanan,C.Sminchisescu.Training Deformable Models for Localization.in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [152]Z.Rasheed,Y.Sheikh,M.Shah.On the Use of Computable Features for Film Classification.IEEE Transactions on Circuit and Systems for Video Technology,2005.15(1):52-64.
    [153]T.H.Reiss.Recognizing Planar Objects Using Invariant Image Features.Springer-Verlag New York,Inc..1993.
    [154]X.Ren,A.C.Berg,J.Malik.Recovering Human Body Configurations Using Pairwise Constraints between Parts.in IEEE International Conference on Computer Vision 2005.Beijing,China.
    [155]C.Ridder,O.Munkelt,H.Kirchner.Adaptive Background Estimation and Foreground Detection Using Kalman-Filtering.in International Conference on recent Advances in Mechatronics.1995.Istanbul,Turkey.
    [156]E.S.gistad,P.N.Yianilos.Towards Em-Style Algorithms for a Posteriori Optimization of Normal Mixtures.in IEEE international symposium on information theory 1998.
    [157]T.J.Roberts,S.J.McKenna,I.W.Ricketts.Adaptive Learning of Statistical Appearance Models for 3d Human Tracking.in British Machine Vision Conference.2002.Cardiff,UK.
    [158]T.J.Roberts,SJ.McKenna,I.W.Ricketts.Human Pose Estimation Using Learnt Probabilistic Region Similarities and Partial Configurations.in European Conference on Computer Vision.2004.Prague,Czech Republic.
    [159]K.Roht.Human Movement Analysis Based on Explicit Motion Models M.Shah and R.Jain Eds,Motion Based Recognition.Vol.8.Kluwer academic publishers.1997.
    [160]R.Ronfard,C.Schmid,B.Triggs.Learning to Parse Pictures of People.in European Conference on Computer Vision.2002.Copenhagen,Denmark.
    [161]D.A.Ross,R.S.Zemel.Learning Parts-Based Representations of Data.Journal of Machine Learning Research,2006.7:2369-2397.
    [162]M.Rossi,A.Bozzoli.Tracking and Counting Moving People in IEEE International Conference on Image Processing.1994.
    [163]S.T.Roweis,L.K.Saul.Nonlinear Dimensionality Reduction by Locally Linear Embedding.Science,2000.290(5500):2323-2326.
    [164]M.W.Schwarz,W.B.Cowan,J.C.Beatty.An Experimental Comparison of Rgb,Yiq,Lab.Hsv,and Opponent Color Models.ACM Transactions on Graphics,1987.6(2):123-158.
    [165]H.Segawa,H.Shioya,N.Hiraki,T.Totsuka.Constraint-Conscious Smoothing Framework for the Recovery of 3d Articulated Motion from Image Sequences.in IEEE International Conference on Automatic Face and Gesture Recognition 2000.Grenoble,France.
    [166]G.Shakhnarovich,P.Viola,T.Darrell.Fast Pose Estimation with Parameter-Sensitive Hashing.in IEEE International Conference on Computer Vision.2003.Nice,France.
    [167]J.-C.Shim,C.Dorai.A Generalized Region Labeling Algorithm for Image Coding,Restoration,and Segmentation.in IEEE International Conference on Image Processing.1999.Kobe,Japan.
    [168]B.Shneiderman,P.Maes.Direct Manipulation Vs.Interface Agents:A Debate.Interactions,1997.4(6):643-661.
    [169]H.Sidenbladh,M.J.Black,D.J.Fleet.Stochastic Tracking of 3d Human Figures Using 2d Image Motion.in European Conference on Computer Vision.2000.Dublin,Ireland.
    [170]H.Sidenbladh,M.J.Black.Learning Image Statistics for Bayesian Tracking.in IEEE International Conference on Computer Vision.2001.Vancouver,Canada.
    [171]H.Sidenbladh,M.J.Black,L.Sigal.Implicit Probabilistic Models of Human Motion for Synthesis and Tracking.in European Conference on Computer Vision.2002.Copenhagen,Denmark.
    [172]L.Sigal,S.Sclaroff,V.Athitsos.Skin Color-Based Video Segmentation under Time-Varying Illumination.IEEE Transactions on Pattern Analysis and Machine Intelligence,2004.26(7):862-877.
    [173]L.Sigal,M.J.Black.Measure Locally,Reason Globally:Occlusion-Sensitive Articulated Pose Estimation.in IEEE Conference on Computer Vision and Pattern Recognition 2006.New York,NY,USA.
    [174]L.Sigal,A.Balan,M.J.Black.Combined Discriminative and Generative Articulated Pose and Non-Rigid Shape Estimation.in Advances in Neural Information Processing Systems.2007.Vancouver,B.C.,Canada.
    [175]C.Sminchisescu,B.Triggs.Covariance Scaled Sampling for Monocular 3d Body Tracking.in IEEE Conference on Computer Vision and Pattern Recognition.2001.Kauai Marriott,Hawaii.
    [176]C.Sminchisescu.Consistency and Coupling in Human Model Likelihoods.in IEEE International Conference on Automatic Face and Gesture Recognition 2002.Washington,D.C.,USA.
    [177]C.Sminchisescu,B.Triggs.Kinematic Jump Processes for Monocular 3d Human Tracking.in IEEE Conference on Computer Vision and Pattern Recognition.2003.Madison,WI,USA.
    [178]C.Sminchisescu,A.D.Jepson.Generative Modeling for Continuous Non-Linearly Embedded Visual Inference.in International Conference on Machine Learning.2004.Banff,Alberta,Canada.
    [179]C.Sminchisescu,A.Kanaujia,Z.Li,D.N.Metaxas.Discriminative Density Propagation for 3D Human Motion Estimation.in IEEE Conference on Computer Vision and Pattern Recognition.2005.San Diego,CA,USA.
    [180]C.Sminchisescu,A.Kanaujia,Z.Li,D.N.Metaxas.Conditional Visual Tracking in Kernel Space.in Advances in Neural Information Processing Systems.2005.Vancouver,Canada.
    [181]C.Sminchisescu,A.Kanaujia,D.N.Metaxas.Learning Joint Top-Down and Bottom-up Processes for 3D Visual Inference.in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [182]A.R.Smith.Color Gamut Transform Pairs.Computer Graphics,1978.12(3):12-19.
    [183]S.Sural,G.Qian,S.Pramanik.Segmentation and Histogram Generation Using the HSV Color Space for Image Retrieval.in IEEE International Conference on Image Processing.2002.Rochester,New York,USA.
    [184]L.Taycher,G.Shakhnarovich,D.Demirdjian,T.Darrell.Conditional Random People:Tracking Humans with CRFs and Grid Filters.in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [185]C.J.Taylor.Reconstruction of Articulated Objects from Point Correspondences in a Single Uncalibrated Image.Computer Vision and Image Understanding,2000.80(3):349-363.
    [186]J.B.Tenenbaum,V.de Silva,J.C.Langford.A Global Geometric Framework for Nonlinear Dimensionality Reduction.Science,2000.290:2319-2323.
    [187]K.Tollmar,D.Demirdjian,T.Darrell.Gesture + Play:Full Body Interaction for Virtual Environments.in CHI Extended Abstracts.2003.Fort Lauderdale,Florida,USA.
    [188]M.Turk,A.Pentland.Face Recognition Using Eigenfaces.in IEEE Conference on Computer Vision and Pattern Recognition.1991.Maui,HI,USA.
    [189]M.Turk.Computer Vision in the Interface.Communications of the ACM,2004.47(1):60-67.
    [190]R.Urtasun,D.J.Fleet,P.Fua.Monocular 3D Tracking of the Golf Swing.in IEEE Conference on Computer Vision and Pattern Recognition.2005.San Diego,CA,USA.
    [191]R.Urtasun,D.J.Fleet,A.Hertzmann,P.Fua.Priors for People Tracking from Small Training Sets.in IEEE International Conference on Computer Vision.2005.Beijing,China.
    [192]R.Urtasun,D.J.Fleet,P.Fua.3D People Tracking with Gaussian Process Dynamical Models.in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [193]R.Urtasun,D.Fleet,T.Darrell,N.D.Lawrence.Topologically-Constrained Latent Variable Models.in International Conference on Machine Learning.2008.
    [194]A.Utsumi,H.Mori,J.Ohya,M.Yachida.Multiple-View-Based Tracking of Multiple Humans.in IEEE International Conference on Pattern Recognition.1998.
    [195]P.Viola,M.Jones.Rapid Object Detection Using a Boosted Cascade of Simple Features.in IEEE Conference on Computer Vision and Pattern Recognition.2001.Kauai,HI,USA.
    [196]P.Viola,M.Jones,D.Snow.Detecting Pedestrians Using Patterns of Motion and Appearance.in IEEE International Conference on Computer Vision.2003.Nice,France.
    [197]S.Wachter,H.H.Nagel.Tracking Persons in Monocular Image Sequences.Computer Vision and Image Understanding,1999.74(3):174-192.
    [198]L.Wang,W.Hu,T.Tan.Recent Developments in Human Motion Analysis.Pattern Recognition,2003.36(3):585-601.
    [199]J.A.Webb,J.K.Aggarwal.Structure from Motion of Rigid and Jointed Objects.Artificial Intelligence,1982.19(1):107-130.
    [200]C.R.Wren,A.Azarbayejani,T.Darrell,A.P.Pentland.Pfinder:Real-Time Tracking of the Human Body.IEEE Transactions on Pattern Analysis and Machine Intelligence,1997.19(7):780-785.
    [201]R.Xiao,L.Zhu,H.-J.Zhang.Boosting Chain Learning for Object Detection.in IEEE International Conference on Computer Vision.2003.Nice,France.
    [202]M.Yamamoto,K.Koshikawa.Human Motion Analysis Based on a Robot Arm Model.in IEEE Conference on Computer Vision and Pattern Recognition.1991.Maui,Hawaii,USA.
    [203]Y.Yusoff,W.Christmas,J.Kittler.Video Shot Cut Detection Using Adaptive Thresholding.in British Machine Vision Conference.2000.Bristol.
    [204]B.D.Zarit,B.J.Super,F.K.H.Quek.Comparison of Five Color Models in Skin Pixel Classification.in International Workshop on Recognition,Analysis,and Tracking of Faces and Gestures in Real-time Systems.1999.
    [205]D.-Q.Zhang,S.-F.Chang.A Generative-Discriminative Hybrid Method for Multi-View Object Detection.in IEEE Conference on Computer Vision and Pattern Recognition.2006.New York,NY,USA.
    [206]H.Zhong,L.Wenyin,S.Li.Interactive Tracker - a Semi-Automatic Video Object Tracking and Segmentation System.in IEEE International Conference on Multimedia and Expo.2001.
    [207]E.Zitzler,L.Thiele.Multiobjective Evolutionary Algorithms:A Comparative Case Study and the Strength Pareto Approach.IEEE Transactions on Evolutionary Computation,1999.3(4):257-271.
    [208]王涛,胡事民.基于颜色-空间特征的图像检索.软件学报,2002.13(10):2031-2036.
    [209]史奈德(美).计算机图形学几何工具算法详解.电子工业出版社.2005.
    [210]周项敏,王国仁.基于关键维的高维空间划分策略.软件学报,2004.15(9):1361-1374.
    [211]季白杨,陈纯,钱英.视频分割技术的发展.计算机研究与发展,2001.38(1):36-42.
    [212]王亮,胡卫明,谭铁牛.人运动的视觉分析综述.计算机学报,2002.25(3):225-237.
    [213]胡包钢,谭铁牛,王珏.情感计算-计算机科技发展的新课题.科学时报(第三版).2000.
    [214]刘晓玲编.视觉神经生理学.人民卫生出版社.2004.
    [215]张丽,李志能.基于阴影检测的HSV空间自适应背景模型的车辆追踪检测.中国图象图形学报:A辑,2003.8(7):778-782.
    [216]陈允杰,张建伟,韦志辉,王平安,夏德深.基于HSV颜色空间的中国虚拟人脑图像自动分割方法.计算机研究与发展,2007.44(12):2036-204.
    [217]马颂德,张正友.计算机视觉-计算理论与算法基础.科学出版社.1998.
