Unsupervised Feature Learning for RGB-D Image Classification

Authors: I-Hong Jhuo (17)
Shenghua Gao (18)
Liansheng Zhuang (19)
D. T. Lee (17) (20)
Yi Ma (18) (21)

17. Institute of Information Science ; Academia Sinica ; Taipei ; Taiwan
18. School of Information Science and Technology ; ShanghaiTech University ; Shanghai ; China
19. CAS Key Laboratory of Electromagnetic Space Information ; USTC ; Hefei ; China
20. Department of Computer Science ; National Chung Hsing University ; Taichung ; Taiwan
21. Department of ECE ; University of Illinois at Urbana-Champaign ; Champaign ; USA
Journal: Lecture Notes in Computer Science
Year: 2015
Volume: 9003
Issue: 1
Pages: 276-289
Full text size: 1,037 KB
Book title: Computer Vision -- ACCV 2014
ISBN: 978-3-319-16864-7
Category: Computer Science
Subjects: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349

Abstract

Motivated by the success of deep neural networks in computer vision, we propose a deep Regularized Reconstruction Independent Component Analysis network (R\(^2\)ICA) for RGB-D image classification. In each layer of this network, we include an R\(^2\)ICA as the basic building block to determine the relationship between the gray-scale and depth images corresponding to the same object or scene. By applying the commonly used local contrast normalization and spatial pooling, we gradually make the network resilient to local variance, resulting in a robust image representation for RGB-D image classification. Moreover, in contrast to conventional handcrafted feature-based RGB-D image representations, the proposed deep R\(^2\)ICA is a feedforward network and is therefore more efficient for image representation. Experimental results on three publicly available RGB-D datasets demonstrate that the proposed method consistently outperforms state-of-the-art conventional, manually designed RGB-D image representations, confirming its effectiveness for RGB-D image classification.
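
The abstract names three ingredients per layer: a reconstruction-ICA building block over paired gray-scale and depth inputs, local contrast normalization, and spatial pooling. The sketch below illustrates that pipeline in NumPy under loud assumptions: `rica_cost` is the standard reconstruction ICA objective (Le et al., NIPS 2011), not the paper's regularized R\(^2\)ICA variant, whose extra term the abstract does not specify; `encode_layer`, its simplified per-patch normalization, and its 1-D max pooling over adjacent patches are hypothetical stand-ins chosen for brevity.

```python
import numpy as np


def rica_cost(W, X, lam=0.1, eps=1e-6):
    """Standard reconstruction ICA cost (assumption: NOT the paper's R^2ICA).

    W: (k, d) filter matrix.
    X: (d, m) matrix holding m input patches as columns.
    """
    m = X.shape[1]
    Z = W @ X                                   # (k, m) filter responses
    residual = W.T @ Z - X                      # (d, m) reconstruction error
    recon_term = lam / m * np.sum(residual ** 2)
    sparsity_term = np.sum(np.sqrt(Z ** 2 + eps)) / m  # smooth L1 penalty
    return recon_term + sparsity_term


def encode_layer(W, gray_patches, depth_patches, pool=2, eps=1e-6):
    """One feedforward layer: filter, normalize, pool (hypothetical sketch).

    gray_patches, depth_patches: (d, m) each; columns are matching patches.
    W: (k, 2d) filters applied to the stacked gray + depth patch.
    """
    # Stack the two modalities so each filter sees both at once.
    X = np.concatenate([gray_patches, depth_patches], axis=0)  # (2d, m)
    Z = np.sqrt((W @ X) ** 2 + eps)                            # (k, m)

    # Simplified contrast normalization: standardize each patch's
    # response vector (real LCN uses a spatial neighborhood).
    Z = (Z - Z.mean(axis=0, keepdims=True)) / (Z.std(axis=0, keepdims=True) + eps)

    # Simplified spatial pooling: max over groups of adjacent patches.
    k, m = Z.shape
    if m % pool != 0:
        raise ValueError("number of patches must be divisible by pool size")
    return Z.reshape(k, m // pool, pool).max(axis=2)           # (k, m // pool)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gray = rng.standard_normal((5, 4))
    depth = rng.standard_normal((5, 4))
    W = rng.standard_normal((8, 10))
    print(encode_layer(W, gray, depth).shape)
    print(rica_cost(W, np.concatenate([gray, depth], axis=0)))
```

Because the layer is a fixed sequence of matrix products and elementwise operations, encoding a new image needs no per-image optimization, which is the efficiency argument the abstract makes for a feedforward design.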
