Estimate Hand Poses Efficiently from Single Depth Images
  • Authors: Chi Xu ; Ashwin Nanjappa ; Xiaowei Zhang ; Li Cheng
  • Keywords: Hand pose estimation ; Depth images ; GPU acceleration ; Regression forests ; Consistency analysis ; Annotated hand image dataset
  • Journal: International Journal of Computer Vision
  • Publication year: 2016
  • Publication date: January 2016
  • Volume: 116
  • Issue: 1
  • Pages: 21-45
  • Full-text size: 4,777 KB
  • Author affiliations: Chi Xu (1)
    Ashwin Nanjappa (1)
    Xiaowei Zhang (1)
    Li Cheng (1) (2)

    1. The Bioinformatics Institute, A*STAR, Singapore, Singapore
    2. School of Computing, National University of Singapore, Singapore, Singapore
  • Journal category: Computer Science
  • Journal subjects: Computer Imaging, Vision, Pattern Recognition and Graphics
    Artificial Intelligence and Robotics
    Image Processing and Computer Vision
    Pattern Recognition
  • Publisher: Springer Netherlands
  • ISSN: 1573-1405
Abstract
This paper tackles the challenging practical problem of efficient and accurate hand pose estimation from single depth images. A dedicated two-step regression forest pipeline is proposed. Given an input hand depth image, step one mainly estimates the 3D location and in-plane rotation of the hand using a pixel-wise regression forest. Step two uses this estimate to deliver the final hand pose estimate with a similar regression forest model based on the entire hand image patch. Moreover, our estimation is guided by internally executing a 3D hand kinematic chain model. For an unseen test image, the kinematic model parameters are estimated by a proposed dynamically weighted scheme. As a combined effect of these building blocks, our approach delivers more precise estimation of hand poses. In practice, our approach runs at 15.6 frames per second (FPS) on an average laptop when implemented on the CPU, which is further sped up to 67.2 FPS when running on the GPU. In addition, we introduce and make publicly available a data-glove annotated depth image dataset covering various hand shapes and gestures, which enables us to conduct quantitative analyses on real-world hand images. The effectiveness of our approach is verified empirically on both synthetic and annotated real-world datasets for hand pose estimation, as well as on related applications including part-based labeling and gesture classification. In addition to the empirical studies, the consistency property of our approach is analyzed theoretically.
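The two-step idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 2D point features, the `Forest` class, the `rotate` helper, and all parameter values are assumptions made for illustration, and the kinematic chain model, pixel-wise voting, dynamic weighting, and GPU acceleration are left out. Stage one regresses an in-plane rotation, the input is normalized with that estimate, and stage two regresses a pose parameter from the normalized input.

```python
import math
import random
from statistics import mean

def build_tree(X, y, depth, rng, min_leaf=2):
    """Grow one randomized regression tree; each leaf stores the mean target."""
    if depth == 0 or len(X) < 2 * min_leaf:
        return ("leaf", mean(y))
    f = rng.randrange(len(X[0]))      # random feature index
    t = rng.choice(X)[f]              # random threshold drawn from the data
    left = [i for i, x in enumerate(X) if x[f] <= t]
    right = [i for i, x in enumerate(X) if x[f] > t]
    if len(left) < min_leaf or len(right) < min_leaf:
        return ("leaf", mean(y))
    return ("node", f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth - 1, rng, min_leaf),
            build_tree([X[i] for i in right], [y[i] for i in right], depth - 1, rng, min_leaf))

def predict_tree(tree, x):
    """Walk from the root to a leaf and return the stored mean."""
    while tree[0] == "node":
        _, f, t, lo, hi = tree
        tree = lo if x[f] <= t else hi
    return tree[1]

class Forest:
    """Hypothetical minimal regression forest: bootstrap samples, averaged trees."""
    def __init__(self, X, y, n_trees=25, depth=6, seed=0):
        rng = random.Random(seed)
        self.trees = []
        for _ in range(n_trees):
            idx = [rng.randrange(len(X)) for _ in X]   # bootstrap sample
            self.trees.append(
                build_tree([X[i] for i in idx], [y[i] for i in idx], depth, rng))

    def predict(self, x):
        return mean(predict_tree(t, x) for t in self.trees)

def rotate(p, deg):
    """Rotate a 2D point counter-clockwise by deg degrees."""
    a = math.radians(deg)
    return (p[0] * math.cos(a) - p[1] * math.sin(a),
            p[0] * math.sin(a) + p[1] * math.cos(a))

def make_sample(rng):
    """Toy 'hand': a point at radius r (the pose) rotated in-plane by theta."""
    r = rng.uniform(1.0, 2.0)
    theta = rng.uniform(-60.0, 60.0)
    return rotate((r, 0.0), theta), theta, r

rng = random.Random(1)
data = [make_sample(rng) for _ in range(300)]
feats = [d[0] for d in data]

# Step one: estimate the in-plane rotation from the raw input.
stage1 = Forest(feats, [d[1] for d in data])

# Step two: train on inputs normalized with the step-one estimates.
normalized = [rotate(f, -stage1.predict(f)) for f in feats]
stage2 = Forest(normalized, [d[2] for d in data], seed=1)

def estimate(feat):
    """Run both steps on one unseen input."""
    theta = stage1.predict(feat)
    return theta, stage2.predict(rotate(feat, -theta))

theta_hat, r_hat = estimate(rotate((1.5, 0.0), 30.0))
print(f"rotation estimate: {theta_hat:.1f} deg, pose estimate: {r_hat:.2f}")
```

Training stage two on inputs normalized by stage one's own estimates, rather than by the ground-truth rotation, is a design choice of this sketch: it lets the second forest see the same kind of residual error it will meet at test time.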
