Gesture recognition combining aggregate channel features and the dual-tree complex wavelet transform
  • English title: Gesture recognition based on aggregate channel feature and dual-tree complex wavelet transform
  • Authors: Bao Wenxia (鲍文霞); Xie Dongwen (解栋文); Zhu Ming (朱明); Liang Dong (梁栋)
  • Keywords: aggregate channel feature; dual-tree complex wavelet transform (DTCWT); histogram of oriented gradient (HOG) features; local binary pattern (LBP) features; feature fusion; support vector machine (SVM)
  • Journal title (Chinese, abbreviated): ZGTB
  • Journal title (English): Journal of Image and Graphics
  • Institution: College of Electronic Information Engineering, Anhui University
  • Publication date: 2019-07-16
  • Publisher: Journal of Image and Graphics (中国图象图形学报)
  • Year: 2019
  • Issue: v.24; No.279
  • Funding: National Natural Science Foundation of China (61401001, 61501003, 61672032)
  • Language: Chinese
  • Page: ZGTB201907006
  • Number of pages: 9
  • CN: 07
  • ISSN: 11-3758/TB
  • Classification number: 61-69
Abstract
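Objective: Current gesture recognition methods are affected by environment, lighting, rotation, scaling, and skin color, which lowers recognition accuracy. To address this problem, a gesture recognition method for complex backgrounds is proposed that combines gesture detection based on aggregate channel features (ACF) with the dual-tree complex wavelet transform (DTCWT). Method: Aggregate channel features are introduced in gesture image preprocessing, and an Adaboost classifier and the non-maximum suppression (NMS) algorithm are used to detect the target gesture. The DTCWT then decomposes the target gesture image at multiple scales and in multiple directions, and histogram of oriented gradient (HOG) and local binary pattern (LBP) features are extracted from each block of the high- and low-frequency coefficients, respectively. Finally, the high- and low-frequency features from all directions are fused and classified with a support vector machine (SVM). Result: Images from multiple scenes and multiple subjects, taken at different angles and distances, were selected as the training set and annotated to separate foreground from background. Recognition experiments were carried out on 20 gestures, with comparisons against traditional skin color detection, HOG-feature gesture recognition, and Hausdorff-like distance gesture recognition algorithms. Under any tolerable illumination and distance conditions, the method recognizes gestures more accurately and in real time, with an average accuracy of 95.1%. Conclusion: With image preprocessing, the introduction of aggregate channel features enables accurate gesture detection, and the DTCWT-based extraction and fusion of frequency-domain features from gesture images effectively addresses the low accuracy of traditional single-feature recognition methods on ordinary images under varying light and complex backgrounds.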
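The detection stage above relies on non-maximum suppression to collapse overlapping candidate windows into a single hand box. The following is a minimal sketch of greedy NMS, not the authors' implementation: the box format `(x1, y1, x2, y2)`, the function names, and the IoU threshold of 0.5 are all assumptions chosen for illustration, and the scores stand in for whatever confidence the Adaboost detector would output.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 > x1 and y2 > y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # hypothetical value, not from the paper
) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping candidates and one separate candidate.
candidates = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
confidences = [0.9, 0.8, 0.7]
kept = non_max_suppression(candidates, confidences)
```

In this example the second candidate overlaps the first with an IoU of about 0.82, so it is suppressed and `kept` holds the indices of the first and third boxes.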
Objective: As society develops, people's expectations for a better and more convenient life keep rising, and technological progress is making that lifestyle possible. Human-computer interaction plays an increasingly important role in daily life and has become a powerful tool for work, life, and entertainment. Traditional human-computer interaction devices, such as keyboards, mice, and touch screens, restrict how people use computers and limit their imagination because they demand precise operation. Research on gesture recognition from images or video streams is therefore important. Gestures are more natural and flexible than traditional I/O devices, which has made gesture recognition a major research topic. Many methods process input images or video with techniques such as machine learning and image processing to achieve real-time gesture interaction, an active line of research in computer vision. Gesture categories are determined by detecting hand feature information in the extracted image or video stream, which provides technical support for these fields. In some cases, the background around the human body in the scene is complex and diverse, and the lighting, distance, and angle of the hand as seen by the camera vary widely because people move arbitrarily. The study of gesture recognition in complex environments has thus become highly important. Current gesture recognition methods are affected by environment, lighting, rotation, scaling, and skin color, resulting in low recognition accuracy and speed. Thus, a method that combines gesture detection based on aggregate channel features (ACF) with the dual-tree complex wavelet transform (DTCWT) is proposed to solve this problem.
The method recognizes gestures against complex backgrounds using frequency-domain feature extraction. The aggregate channel feature comprises 10 image channels; the pixel features of each channel are processed, filtered, and fused to obtain an ACF.
Method: During gesture image preprocessing, a gesture target detection method using multi-channel feature fusion is introduced as the first step of gesture recognition. An Adaboost classifier and the non-maximum suppression (NMS) algorithm are used to detect target gestures. The target gesture image cropped after detection is then processed with the DTCWT: multiscale, multidirectional decomposition yields high- and low-frequency coefficients. Histogram of oriented gradient (HOG) and local binary pattern (LBP) features are extracted from each block of the high- and low-frequency coefficients, respectively. Finally, the fused high- and low-frequency features are classified by a trained support vector machine (SVM) model. The recognition problem is thus divided into two stages. The first stage detects the target area and removes the background, which significantly improves the efficiency of gesture recognition and paves the way for accurate classification in the second stage.
Result: Images of multiple scenes and subjects at different angles and distances were selected as the training set, and foreground and background were annotated. A total of 20 gesture types were recognized, and the method was compared experimentally with traditional skin color detection, HOG-feature gesture recognition, and Hausdorff-like distance gesture recognition algorithms. For illumination and distance within any acceptable range, the method recognizes gestures accurately in real time, and the average precision reaches 95.1%.
Conclusion: This algorithm has three advantages.
First, the introduced gesture target detection algorithm accurately locates and crops the hand region even when skin-colored interference is present in a complex background, and normalizing the crop to a fixed size handles changes in gesture scale. Second, the DTCWT extracts the high- and low-frequency coefficients of the image in the frequency domain, and features are computed on the high and low frequencies separately. Extracting features from the different signal components eliminates the influence of light and rotation, decreases redundancy and feature dimensionality, and improves the efficiency of feature extraction. Third, the DTCWT offers translation invariance, direction selectivity, and limited redundancy. The method computes quickly and uses little memory, so it can run in real time. When the gesture area is accurately detected, the proposed algorithm achieves satisfactory results. In future work, we will further improve the accuracy of hand detection and classification. Deep neural networks will be applied to additional datasets and gesture types to address the minor factors that may cause misidentification, improve recognition efficiency, and make gesture recognition more practical.
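The feature extraction and fusion step described in the abstract can be illustrated with a small sketch. It computes a basic 8-neighbor LBP histogram for each block and fuses the per-block histograms by concatenation. This is an illustration under stated assumptions rather than the paper's implementation: the blocks here are plain 2D lists standing in for DTCWT subband magnitudes, the neighbor ordering and the 256-bin histogram are one common LBP convention, all function names are hypothetical, and the HOG features and the SVM classifier are omitted.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # assumed: a rectangular 2D grid of values

# Neighbor offsets, clockwise from the top-left pixel (one common convention).
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img: Image, r: int, c: int) -> int:
    """8-bit LBP code of the interior pixel at (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(_OFFSETS):
        # A neighbor at least as bright as the center sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Image) -> List[float]:
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    hist = [0.0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def fuse_features(feature_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-block features by simple concatenation."""
    fused: List[float] = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


# Hypothetical stand-ins for one low-frequency and one high-frequency block.
low_block = [[5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5], [5, 5, 5, 5]]
high_block = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
fused = fuse_features([lbp_histogram(low_block), lbp_histogram(high_block)])
```

The fused vector has one 256-bin histogram per block, so its length grows linearly with the number of subbands and blocks. A classifier such as an SVM would then be trained on vectors of this kind.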
