Research on Video Object Extraction Based on Frequency-Domain and Temporal-Domain Segmentation


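Author: Yang Wei (杨卫)
Degree level: Master's
Discipline: Signal and Information Processing
Keywords: video object (VO); quadratic B-spline wavelet; affine model; global motion compensation; optical flow field; neighborhood similarity
Degree year: 2003
Supervisor: Liu Zhao (刘钊)
Discipline code: 081002
Degree-granting institution: University of Electronic Science and Technology of China (电子科技大学)
Thesis submission date: 2003-03-25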

Abstract

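Extracting video objects is the most basic step of any video-object-based operation, such as indexing and access. This thesis proposes a method that automatically extracts object contours in the frequency domain and motion vectors in the temporal domain, then combines the two to extract the object. The 3-D motion model, the wavelet transform of images, and the analysis of optical flow fields are each studied and discussed in some depth.
    Chapter 1 first introduces the background of the thesis and current trends in multimedia, and explains why extracting video objects from image sequences is necessary. It then briefly reviews current research methods in China and abroad with their respective strengths and weaknesses, and finally proposes its own method, which is well suited to extracting video objects in encoding and decoding applications.
    Chapter 2 defines and explains the video object model to which this thesis applies and briefly presents the system block diagram. The algorithm follows two main lines, the frequency domain and the temporal domain, and then combines information from both to extract the final video object.
    Chapter 3 focuses on the algorithm for extracting video object information in the frequency domain. It first covers the principle of image wavelet decomposition, the Mallat fast algorithm, multi-scale properties, the choice of the third-order B-spline wavelet basis function, and the derivation of its filter coefficients. It then computes a gradient vector matrix from the wavelet transform result and applies non-maximum suppression and double thresholding to extract the object contour. Finally, the method is compared with the classic Canny edge detector.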
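    The contour step summarized for Chapter 3 (non-maximum suppression followed by double thresholding) can be illustrated with a minimal sketch. It assumes a plain finite-difference gradient as a stand-in for the wavelet-derived gradient, whose details the abstract does not give; the function names and threshold values are hypothetical and are not taken from the thesis.

```python
import numpy as np

def non_max_suppress(gx, gy):
    """Keep a pixel only if its gradient magnitude is a local maximum
    along the (quantised) gradient direction."""
    mag = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    sector = np.round(angle / 45.0).astype(int) % 4   # 0, 45, 90, 135 degrees
    offsets = {0: (0, 1), 1: (1, 1), 2: (1, 0), 3: (1, -1)}  # (dy, dx)
    h, w = mag.shape
    padded = np.pad(mag, 1, mode="constant")
    out = np.zeros_like(mag)
    for s, (dy, dx) in offsets.items():
        fwd = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        bwd = padded[1 - dy:1 - dy + h, 1 - dx:1 - dx + w]
        keep = (sector == s) & (mag >= fwd) & (mag >= bwd)
        out[keep] = mag[keep]
    return out

def double_threshold(nms, low, high):
    """Hysteresis: keep strong pixels (>= high) plus weak pixels (>= low)
    that are 8-connected to a strong pixel through weak pixels."""
    h, w = nms.shape
    weak = nms >= low
    edges = nms >= high
    while True:
        p = np.pad(edges, 1, mode="constant")
        grown = np.zeros_like(edges)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                grown |= p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        grown &= weak
        if np.array_equal(grown, edges):
            return edges
        edges = grown

def edge_map(img, low, high):
    """Assumed stand-in pipeline: finite-difference gradient, NMS, hysteresis."""
    gy, gx = np.gradient(img.astype(float))
    return double_threshold(non_max_suppress(gx, gy), low, high)
```

    Calling `edge_map(img, low, high)` on a grayscale array returns a boolean contour mask. In a wavelet-based variant, `gx` and `gy` would instead come from the horizontal and vertical detail coefficients at the chosen scale.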
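    Chapter 4 describes in detail the method for extracting video object motion information in the temporal domain. It first builds a 3-D rigid-body motion model, proposes a way to compute the model's global motion vector, and carries out global motion compensation, change detection mask extraction, and connected-component labeling. It then introduces the concept of the optical flow field along with its computational principles and methods, uses the Horn-Schunck iterative method to compute the local motion vector at each image point, and uses these vectors to refine the change detection mask, yielding the temporal segmentation information.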
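    The Horn-Schunck iteration named for Chapter 4 can be sketched as follows. This is a textbook form of the update under stated assumptions, not the thesis's implementation: derivatives are simple finite differences, the local flow average uses only the four direct neighbours (the original formulation also weights the diagonals), and `alpha` and `n_iter` are illustrative parameters.

```python
import numpy as np

def neighbour_mean(f):
    """Mean of the 4-connected neighbours, with replicated borders."""
    p = np.pad(f, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

def horn_schunck(frame1, frame2, alpha=1.0, n_iter=100):
    """Estimate a dense flow field (u, v) between two grayscale frames.

    u is the horizontal (column) component, v the vertical (row) component.
    alpha weights the smoothness term against the brightness-constancy term.
    """
    f1 = frame1.astype(float)
    f2 = frame2.astype(float)
    # Spatial derivatives on the average frame, temporal derivative as a difference.
    Iy, Ix = np.gradient((f1 + f2) / 2.0)
    It = f2 - f1
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    denom = alpha ** 2 + Ix ** 2 + Iy ** 2
    for _ in range(n_iter):
        u_bar = neighbour_mean(u)
        v_bar = neighbour_mean(v)
        # Residual of the brightness-constancy equation at the averaged flow.
        t = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * t
        v = v_bar - Iy * t
    return u, v
```

    The magnitude `np.hypot(u, v)` could then be thresholded inside a change detection mask to separate moving pixels from uncovered background; that thresholding rule is an assumption here, since the abstract does not state one.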
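    Chapter 5 builds on the previous two chapters and proposes a neighborhood-similarity criterion for combining the frequency-domain and temporal-domain segmentation results to extract the object contour. Post-processing with region growing and mathematical morphology filtering then yields the final video object.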
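    The post-processing named for Chapter 5 (region growing and morphological filtering) can be illustrated with a generic sketch. The neighborhood-similarity criterion itself is not defined in the abstract, so it is not reproduced; the growth rule below (intensity within `tol` of the seed value) and the 3x3 structuring element are assumptions chosen only for illustration.

```python
from collections import deque
import numpy as np

def region_grow(img, seed, tol):
    """Grow a 4-connected region from `seed` (row, col), adding pixels whose
    value lies within `tol` of the seed value (an assumed similarity rule)."""
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    ref = float(img[seed])
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and abs(float(img[ny, nx]) - ref) <= tol):
                mask[ny, nx] = True
                queue.append((ny, nx))
    return mask

def _shifted_views(mask, pad_value):
    """The nine 3x3-neighbourhood shifts of a boolean mask."""
    h, w = mask.shape
    p = np.pad(mask, 1, mode="constant", constant_values=pad_value)
    return [p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]

def erode(mask):
    # Outside the image counts as foreground, so borders are not eroded away.
    return np.logical_and.reduce(_shifted_views(mask, True))

def dilate(mask):
    return np.logical_or.reduce(_shifted_views(mask, False))

def opening(mask):
    """Erosion then dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))

def closing(mask):
    """Dilation then erosion: fills holes smaller than the 3x3 element."""
    return erode(dilate(mask))
```

    A typical order would be to grow the object region from seeds inside the combined contour, then apply `closing` to fill small holes and `opening` to remove isolated noise pixels.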
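    Chapter 6 tests the proposed algorithm on several standard image sequences and evaluates the results.
    Chapter 7, the final chapter, summarizes the thesis and suggests directions for improvement.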
Video object (VO) extraction is the basic step for all VO-based operations, such as indexing and access. This thesis proposes an automatic and efficient method for extracting VOs. Object contour information from spatial segmentation and motion vector information from temporal segmentation are integrated to obtain the final VOs. The 3-D object motion model, the wavelet transform of images, and the optical flow field are introduced and discussed.
    In Chapter 1, we first introduce the background of this thesis and the future direction of multimedia development to demonstrate the importance of VO extraction. We then briefly present current methods and point out the merits and drawbacks of each. Finally, we put forward an efficient method suited to this task.
    In Chapter 2, we define the VO model and delimit its field of application. We then briefly introduce the overall framework and the algorithm, which integrates information from both temporal and spatial segmentation.
    In Chapter 3, we elaborate on the algorithm that extracts VO information based on spatial segmentation. We first introduce the theory and merits of the image wavelet transform, then the Mallat algorithm, multi-scale characteristics, the quadratic B-spline wavelet, and the coefficients of its filters. We then calculate the gradient matrix from the wavelet transform result, thin the contour, and obtain the spatial information. At the end of the chapter, we compare the method with others, such as the Canny filter.
    In Chapter 4, we discuss in detail the method of VO extraction based on temporal segmentation. We first put forward the affine model, a kind of 3-D motion model of a rigid body, compensate for global motion based on this model, and obtain the change detection mask (CDM). We then introduce the concept of the optical flow field, compute the local motion vectors with the Horn-Schunck method, and extract the essential information in the temporal domain.
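    As an illustration of global motion estimation with an affine model, the sketch below fits the six affine parameters to matched background points by ordinary least squares. The abstract does not say how the thesis computes its global motion vector, so the least-squares fit, the point-correspondence input, and the function names are all assumptions.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares fit of dst ~= A @ src + t from N >= 3 point pairs.

    src, dst: arrays of shape (N, 2) holding (x, y) coordinates.
    Returns the 2x2 matrix A and the translation vector t.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])       # N x 3 design matrix
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)   # 3 x 2 solution
    A = params[:2].T
    t = params[2]
    return A, t

def apply_affine(A, t, pts):
    """Map (N, 2) points through the fitted affine model."""
    return np.asarray(pts, dtype=float) @ A.T + t
```

    With `A` and `t` in hand, the previous frame can be warped onto the current one, and the thresholded difference of the two aligned frames gives a candidate change detection mask.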
    In Chapter 5, we integrate the information obtained in Chapters 3 and 4 using a criterion of neighborhood similarity, then apply final operations, including seed growing and morphological filters, to obtain the final VOs from the video sequence. At the end of the chapter, we compare this integration algorithm with other related algorithms.
    In Chapter 6, we test the whole algorithm on four sets of standard MPEG-4 video sequences and comment on the results.
    In the last chapter, Chapter 7, we summarize the thesis and point out directions for improvement.

