Research on Depth Map Acquisition Algorithms Based on Camera Arrays
Abstract
With the development of information science and computer technology, three-dimensional information acquisition has become a key technology in industrial inspection, biomedicine, virtual reality, and related fields, and it holds broad application prospects in machining, film special effects, advanced gaming, cultural heritage preservation, costume design, 3D communication, aerospace telemetry, military reconnaissance, and so on. The demands these fields place on 3D acquisition technology keep rising. The principal techniques in use today, including multi-view stereo vision, structured light, DFM/SFM (depth from motion / structure from motion), DFF, and TOF (time of flight), each have their own strengths and weaknesses, and no single technique yet satisfies the combined requirements of real-time operation, high accuracy, high resolution, and low cost. TOF is a 3D acquisition technique that has developed rapidly in recent years; it is accurate, real-time, and inexpensive, but it produces uncertain depth data at larger distances and its spatial resolution remains low. Multi-view stereo vision is widely used and has been studied in depth; its main drawbacks are the heavy computational load of stereo matching, its susceptibility to false matches, and, for camera-array systems, a very high implementation cost.
     This thesis studies depth map acquisition algorithms based on a camera array consisting of one TOF camera and several visible-light cameras. The core idea is to start from the low-resolution depth map provided by the TOF camera and derive a high-resolution depth map at the viewpoint of each visible-light camera. When computing the depth map for a given visible-light camera, only that camera's own grayscale or color image and the TOF camera's low-resolution depth and intensity images are used; no passive stereo matching between the visible-light cameras is performed.
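     To make this seeding step concrete, the sketch below (a minimal illustration, not the thesis's pipeline) back-projects the low-resolution TOF depth map into 3-D and reprojects it into a visible-light camera's image plane, assuming a pinhole model with known intrinsics K_tof and K_cam and a TOF-to-color rigid transform (R, t) obtained from the calibration described next; all symbol names are illustrative assumptions.

import numpy as np

def project_tof_depth_to_color_view(depth_tof, K_tof, K_cam, R, t):
    """Warp a low-resolution TOF depth map into a color camera's image plane.

    depth_tof : (H, W) array of metric depths from the TOF sensor.
    K_tof, K_cam : 3x3 pinhole intrinsic matrices (assumed known from calibration).
    R, t : rotation (3x3) and translation (3,) from the TOF frame to the color camera frame.
    Returns pixel coordinates in the color image and the corresponding depths.
    """
    h, w = depth_tof.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth_tof > 0                      # keep only pixels with a measured range
    uv1 = np.stack([u[valid], v[valid], np.ones(valid.sum())])  # homogeneous pixels, 3xN

    # Back-project TOF pixels to 3-D points in the TOF camera frame.
    rays = np.linalg.inv(K_tof) @ uv1
    pts_tof = rays * depth_tof[valid]          # scale each unit-depth ray by its depth

    # Rigidly transform into the color camera frame and project.
    pts_cam = R @ pts_tof + t.reshape(3, 1)
    proj = K_cam @ pts_cam
    px = proj[:2] / proj[2]                    # perspective division
    return px.T, pts_cam[2]                    # (N, 2) pixel coordinates, (N,) depths

     The result is only a sparse, low-resolution scattering of depth samples in the high-resolution color view; the algorithms summarized below are responsible for turning these samples into a dense, high-quality depth map.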
     First, a new method for calibrating a TOF camera against a visible-light camera is proposed, which uses both the intensity image and the depth image captured by the TOF camera. Experiments show that, for calibration between a TOF camera and an ordinary camera, this method achieves higher accuracy than conventional camera calibration.
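     The thesis's method additionally exploits the TOF depth image, which is not reproduced here. For orientation only, the sketch below shows the conventional intensity-based baseline it improves upon: checkerboard corners are detected in the TOF intensity images and in grayscale-converted visible-light images, and OpenCV's standard routines estimate the intrinsics and the TOF-to-color rigid transform. The pattern size and square size are illustrative assumptions.

import cv2
import numpy as np

# Checkerboard geometry (illustrative values; the thesis does not specify a pattern).
PATTERN = (9, 6)           # inner corners per row / column
SQUARE = 0.03              # square edge length in metres

# Canonical 3-D corner coordinates on the board plane (Z = 0).
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

def corners(img):
    """Detect checkerboard corners in one 8-bit grayscale image (TOF intensity or color camera)."""
    ok, pts = cv2.findChessboardCorners(img, PATTERN)
    return pts if ok else None

def calibrate_pair(tof_intensity_imgs, cam_gray_imgs):
    """Estimate both cameras' intrinsics and the TOF-to-color rigid transform."""
    obj, pts_tof, pts_cam = [], [], []
    for it, ic in zip(tof_intensity_imgs, cam_gray_imgs):
        ct, cc = corners(it), corners(ic)
        if ct is not None and cc is not None:   # keep only views seen by both cameras
            obj.append(objp); pts_tof.append(ct); pts_cam.append(cc)

    size_tof = tof_intensity_imgs[0].shape[::-1]
    size_cam = cam_gray_imgs[0].shape[::-1]
    _, K_tof, d_tof, _, _ = cv2.calibrateCamera(obj, pts_tof, size_tof, None, None)
    _, K_cam, d_cam, _, _ = cv2.calibrateCamera(obj, pts_cam, size_cam, None, None)

    # Joint estimation of the TOF-to-color rotation R and translation T.
    _, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
        obj, pts_tof, pts_cam, K_tof, d_tof, K_cam, d_cam, size_tof,
        flags=cv2.CALIB_FIX_INTRINSIC)
    return K_tof, d_tof, K_cam, d_cam, R, T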
     Next, several depth map acquisition algorithms based on the camera array are proposed for different application scenarios. First, for simple indoor scenes, a real-time depth acquisition algorithm is presented that minimizes a squared-error energy function. Second, for complex indoor scenes, a depth map acquisition algorithm is proposed that uses a new energy function fusing color and depth information; this energy function ensures that the most reliable data are selected during the iterative refinement of depth values, and experiments show that the algorithm performs well in complex scenes. Third, the complex-scene algorithm is further improved by introducing descriptions of the uncertainty of the TOF depth values and of the calibration parameters themselves, and by adopting an adaptive neighborhood computation in the iterative refinement. Experiments confirm that, in their respective application scenarios, the proposed algorithms successfully generate high-quality, high-resolution depth maps at the viewpoint of every visible-light camera in the array.
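     As a rough illustration of this kind of color-weighted squared-error refinement (a simplified sketch under assumed forms of the window and weights, omitting the uncertainty terms and adaptive neighborhoods and therefore not the thesis's actual energy), the following routine estimates a dense depth map from the sparse projected TOF samples by minimizing, at each pixel, a color-weighted sum of squared differences to the neighboring samples; the minimizer is simply their weighted mean.

import numpy as np

def refine_depth(sparse_depth, color, radius=5, sigma_c=10.0):
    """Fill a per-pixel depth for a color image from sparse projected TOF samples.

    Illustrative energy: for each pixel p, choose the depth d minimizing
        E_p(d) = sum over neighbors q carrying a TOF sample of  w_c(p, q) * (d - d_q)^2,
    where w_c is a color-similarity weight. The minimizer is the weighted mean
    of the neighboring TOF depths (a joint-bilateral-style estimate).

    sparse_depth : (H, W) array, 0 where no TOF sample was projected.
    color        : (H, W, 3) color image aligned with sparse_depth.
    """
    h, w = sparse_depth.shape
    out = np.zeros_like(sparse_depth, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            d = sparse_depth[y0:y1, x0:x1]
            mask = d > 0
            if not mask.any():
                continue                              # no depth evidence in this window
            dc = color[y0:y1, x0:x1].astype(np.float64) - color[y, x]
            w_c = np.exp(-np.sum(dc * dc, axis=2) / (2.0 * sigma_c ** 2))
            w_c = w_c * mask                          # only pixels carrying a TOF depth vote
            out[y, x] = np.sum(w_c * d) / np.sum(w_c) # closed-form minimizer of E_p
    return out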
     Finally, the depth acquisition algorithms presented in this thesis are summarized and directions for future research are outlined.