Video-based Human Action Recognition.
Details
  • Author:Wang, Jiang
  • Degree:Doctorate
  • Year:2014
  • Institution:Northwestern University
  • Department:Electrical and Computer Engineering.
  • ISBN:9781321079623
  • CBH:3630191
  • Country:USA
  • Language:English
  • FileSize:7377401
  • Pages:188
Abstract
Video-based human action recognition enables computers to understand human behaviors purely from visual data. There has been active and fruitful research on this topic for decades because of its wide applications, such as video surveillance, human-computer interfaces, and video retrieval. However, video-based human action recognition remains a challenging task. Human actions usually consist of high-dimensional, complex spatio-temporal patterns, including human-human and human-object interactions. The spatio-temporal patterns of human actions vary widely, because an action can be performed at a different pace or in a different order, and it can be captured from different viewpoints. To model complex spatio-temporal patterns, we need not only effective low-level visual representations that compactly characterize the high-dimensional spatio-temporal patterns, but also sophisticated data mining and machine learning algorithms that find the discriminative spatio-temporal patterns and learn meaningful statistical models to understand their semantic meanings. In this thesis, we work on three main themes towards solving these challenges: visual representation, pattern mining, and discriminative pattern learning. We propose a novel spatio-temporal contextual representation for human actions, which is compact but at the same time capable of characterizing human-object interactions. Visual representations are also sensor-dependent. The recently developed depth cameras, which largely extend the capability of computer vision systems, require completely different visual representations. We develop visual representations for depth cameras to fully exploit their power for human action recognition. We also propose a discriminative data mining method to discover compact spatio-temporal patterns called Actionlets.
The Actionlet ensemble model not only characterizes human actions with good generalization ability, but is also robust to occlusions and viewpoint changes. Mining of the Actionlet ensemble is performed on depth camera sequences, but with the development of the Multiview Spatio-Temporal AND-OR Graph (MST-AOG) model, we can detect Actionlets in regular videos from any viewpoint, so that only regular videos are required in testing. Finally, maximum margin dynamic temporal warping is proposed to model the temporal structures of human actions. It discriminatively learns action templates and performs temporal alignment at the same time, by representing an action with a set of phantom action templates, each of which consists of a sequence of discriminatively learned atomic actions. Extensive experiments on real-world action recognition datasets demonstrate excellent results for the proposed action recognition algorithms.
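
The abstract does not give the formulation of maximum margin dynamic temporal warping, so the sketch below is background only: it shows the classic, non-discriminative dynamic time warping alignment cost between two sequences, which is the general kind of temporal alignment the abstract refers to. It is not the thesis's method. The function name `dtw_distance`, the scalar frame features, and the absolute-difference frame cost are all assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping cost between two sequences.

    Illustrative only: the thesis's maximum margin variant learns its
    templates discriminatively, which this sketch does not attempt.
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = minimal cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments:
            # advance in seq_a only, in seq_b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical per-frame features: the same "action" at two different paces.
template = [1, 2, 3]
slower = [1, 2, 2, 3]
print(dtw_distance(template, slower))
```

Because the warping path may repeat a frame of either sequence, the slower performance aligns to the template at zero cost, which illustrates why alignment-based matching tolerates differences in pace.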
