Unsupervised Feature Learning for RGB-D Image Classification
  • Authors: I-Hong Jhuo (17)
    Shenghua Gao (18)
    Liansheng Zhuang (19)
    D. T. Lee (17) (20)
    Yi Ma (18) (21)

    17. Institute of Information Science, Academia Sinica, Taipei, Taiwan
    18. School of Information Science and Technology, ShanghaiTech University, Shanghai, China
    19. CAS Key Laboratory of Electromagnetic Space Information, USTC, Hefei, China
    20. Department of Computer Science, National Chung Hsing University, Taichung, Taiwan
    21. Department of ECE, University of Illinois at Urbana-Champaign, Champaign, USA
  • Publication: Lecture Notes in Computer Science
  • Year: 2015
  • Volume: 9003
  • Issue: 1
  • Pages: 276-289
  • Full-text size: 1,037 KB
  • Book title: Computer Vision -- ACCV 2014
  • ISBN: 978-3-319-16864-7
  • Category: Computer Science
  • Subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
Motivated by the success of deep neural networks in computer vision, we propose a deep Regularized Reconstruction Independent Component Analysis network (R\(^2\)ICA) for RGB-D image classification. Each layer of the network uses an R\(^2\)ICA as its basic building block to capture the relationship between the gray-scale and depth images of the same object or scene. By applying the commonly used local contrast normalization and spatial pooling, we gradually make the network resilient to local variance, which yields a robust image representation for RGB-D image classification. Moreover, in contrast to conventional handcrafted-feature-based RGB-D image representations, the proposed deep R\(^2\)ICA is a feedforward network and is therefore more efficient for image representation. Experimental results on three publicly available RGB-D datasets demonstrate that the proposed method consistently outperforms state-of-the-art conventional, manually designed RGB-D image representations, confirming its effectiveness for RGB-D image classification.
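
The abstract describes each layer as an R\(^2\)ICA building block followed by local contrast normalization and spatial pooling. A minimal NumPy sketch of that kind of pipeline is given below, under several assumptions. The cost shown is the standard reconstruction ICA objective (a sparsity penalty plus a reconstruction error), not the paper's regularized variant, whose additional term the abstract does not specify. Stacking gray-scale and depth patches into one input, the simplified per-patch normalization, and the use of average pooling are illustrative choices, and every function and parameter name here (`rica_cost`, `encode_gray_depth`, `lam`, and so on) is hypothetical.

```python
import numpy as np


def rica_cost(W, X, lam=0.1, eps=1e-8):
    """Standard reconstruction ICA cost (NOT the paper's regularized variant).

    W: (k, d) filter matrix; X: (d, n) data, one patch per column.
    Returns lam * smooth-L1 sparsity + squared reconstruction error,
    both averaged over the n patches.
    """
    Z = W @ X                  # feature responses, shape (k, n)
    R = W.T @ Z - X            # reconstruction residual, shape (d, n)
    n = X.shape[1]
    sparsity = np.sum(np.sqrt(Z ** 2 + eps)) / n
    reconstruction = np.sum(R ** 2) / n
    return lam * sparsity + reconstruction


def encode_gray_depth(W, gray_patches, depth_patches):
    """Assumption: gray and depth patches are stacked into one joint input.

    gray_patches, depth_patches: (p, n) each; W: (k, 2 * p).
    """
    X = np.vstack([gray_patches, depth_patches])   # (2p, n)
    return np.abs(W @ X)                           # rectified responses, (k, n)


def contrast_normalize(F, eps=1e-8):
    """Simplified stand-in for local contrast normalization.

    Centers and scales the feature vector of each patch (each column of F).
    """
    F = F - F.mean(axis=0, keepdims=True)
    return F / (F.std(axis=0, keepdims=True) + eps)


def average_pool(feature_map, size=2):
    """Non-overlapping average pooling over a (H, W, k) feature map.

    H and W must be divisible by `size`.
    """
    H, Wd, k = feature_map.shape
    blocks = feature_map.reshape(H // size, size, Wd // size, size, k)
    return blocks.mean(axis=(1, 3))                # (H // size, W // size, k)
```

In this sketch a layer would learn `W` by minimizing `rica_cost` on joint patches, then apply `encode_gray_depth`, `contrast_normalize`, and `average_pool` in sequence; a deeper network would feed the pooled maps to the next layer in the same feedforward way.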
