Augmenting bag-of-words: a robust contextual representation of spatiotemporal interest points for action recognition


Authors: Yang Li; Junyong Ye; Tongqing Wang; Shijian Huang
Keywords: Action recognition; Contextual features; Cumulative probability histogram; Sparse coding
Journal: The Visual Computer
Year: 2015
Publication date: October 2015
Volume: 31
Issue: 10
Pages: 1383-1394
Full-text size: 2,392 KB
Affiliations: Yang Li (1); Junyong Ye (1); Tongqing Wang (1); Shijian Huang (1)

1. Key Laboratory of Optoelectronic Technology and Systems of the Ministry of Education, Chongqing University, Chongqing, China
Journal category: Computer Science
Journal subjects: Computer Graphics; Computer Science, general; Artificial Intelligence and Robotics; Image Processing and Computer Vision
Publisher: Springer Berlin / Heidelberg
ISSN: 1432-2315

Abstract

Although the traditional bag-of-words model, combined with local spatiotemporal features, has shown promising results for human action recognition, it ignores the structural information of features, which carries important cues about motion structure in videos. Recent methods typically address this drawback by characterizing the relationships among quantized spatiotemporal features. However, quantization error propagates through such representations and makes them unreliable. To alleviate this propagation, we present a coding method that, after giving the coding coefficients a probabilistic interpretation, considers not only the spatial similarity but also the reconstruction ability of visual words. Based on this coding method, we propose a new type of feature, the cumulative probability histogram, to robustly characterize the contextual structural information around interest points. These features are extracted from multi-layered contexts and are assumed to be complementary to local spatiotemporal features. The proposed method is evaluated on four benchmark datasets. Experimental results show that it achieves better performance than previous methods in action recognition.
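
The abstract does not define the coding method or the cumulative probability histogram, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes standard soft-assignment coding (a softmax over negative squared distances to the visual words) as a stand-in for the probabilistic coding, and it assumes that "multi-layered contexts" can be approximated by nested spheres of increasing radius around an interest point. The function names, the `beta` parameter, and the radii are all hypothetical.

```python
import math


def soft_assign(descriptor, codebook, beta=1.0):
    """Return a probability vector over visual words (soft-assignment coding).

    Assumption: this softmax form stands in for the paper's coding method,
    which the abstract does not specify.
    """
    sq_dists = [
        sum((d - c) ** 2 for d, c in zip(descriptor, word)) for word in codebook
    ]
    # Shift by the smallest distance so the exponentials cannot underflow to all zeros.
    d_min = min(sq_dists)
    weights = [math.exp(-beta * (d - d_min)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def layered_context_histogram(center, positions, codes, radii):
    """Accumulate code vectors of points inside nested spheres around `center`.

    Assumption: each layer contains all inner layers, so the histogram is
    cumulative over increasing radius. One normalized histogram per radius
    is concatenated into the final feature.
    """
    n_words = len(codes[0])
    feature = []
    for radius in sorted(radii):
        hist = [0.0] * n_words
        for pos, code in zip(positions, codes):
            if math.dist(center, pos) <= radius:
                for k, p in enumerate(code):
                    hist[k] += p
        total = sum(hist)
        if total > 0:
            hist = [h / total for h in hist]
        feature.extend(hist)
    return feature


# Toy data: two visual words, three interest points at (x, y, t) positions.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [0.5, 0.5]]
positions = [(0, 0, 0), (1, 0, 0), (5, 5, 5)]

codes = [soft_assign(d, codebook) for d in descriptors]
feature = layered_context_histogram((0, 0, 0), positions, codes, radii=[2.0, 10.0])
```

Because each point contributes a full probability vector rather than a single hard-assigned word, a descriptor lying between two visual words spreads its vote across both, which is one common way to reduce the effect of quantization error on a contextual histogram.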
