Video event recognition and prediction based on temporal structure analysis.

Details

Author: Li, Kang
Degree: Doctor
Year: 2015
Institution: Northeastern University
Department: Electrical and Computer Engineering
ISBN: 9781321426656
CBH: 3668030
Country: USA
Language: English
FileSize: 10864285
Pages: 181

Abstract

The increasing ubiquity of multimedia information in today's world has positioned video as a favored information vehicle, and given rise to an astonishing generation of social media and surveillance footage. Consumer-grade video is becoming abundant on the Internet, and it is now easier than ever to download multimedia material of any kind and quality. This raises a series of technological demands for automatic video understanding, which has motivated the research community to work toward such capabilities. As a result, current trends in cognitive vision promise to recognize complex events and self-adapt to different environments, while managing and integrating several types of knowledge. One important problem whose solution will significantly enhance semantic-level video analysis is activity and event understanding, which aims at accurately describing video contents using key semantic elements, such as activities and events. One well-known challenge is the long-standing semantic gap between computable low-level features and the semantic information that they encode. In this thesis, several studies of high-level video content understanding are presented, which address these difficulties and narrow the semantic gap effectively. In particular, we focus on two types of videos, namely human activity video and unconstrained consumer video. The proposed temporal structure analysis frameworks significantly extend the domains of video that can be understood by machine vision systems. For human activity recognition, we observe that, when a time-critical decision is needed, no existing work utilizes the temporal structure of videos for early prediction of ongoing human activity. We therefore present a general activity prediction framework in which human activities can be characterized by a complex temporal composition of constituent simple actions and interacting objects.
We then extend our work to the 3D case of action prediction, motivated by the recent advent of cost-effective sensors such as the Kinect depth camera. By considering 3D action data as multivariate time series (m.t.s.) synchronized to a shared common clock (frames), we propose a stochastic process called a Marked Point Process (MPP) that models the 3D action as temporal dynamic patterns, where both timing and strength information are captured. For unconstrained consumer video understanding, we also focus on the temporal structure of the video content through a semantic-segment based design, in which each video clip can be represented as a series of varying videography words. Unique videography signatures from different events can then be automatically identified using statistical analysis methods. We explore the use of videography analysis for different types of applications, including content-based video retrieval, video summarization (both visual and textual), and videography-based feature pooling.
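
The abstract describes 3D action data as a multivariate time series on a shared frame clock, modelled as a marked point process that captures both timing and strength. As a rough illustration only, and not the thesis's actual model, the sketch below turns a toy series into a list of marked events: each event records the frame at which a channel changed (timing) and the size of that change (strength). The `Event` type, the `to_marked_events` function, the thresholding rule, and the toy numbers are all assumptions made for this example.

```python
from typing import List, NamedTuple, Sequence


class Event(NamedTuple):
    frame: int      # timing: frame index on the shared clock
    channel: int    # which variable of the multivariate series fired
    mark: float     # strength: magnitude of the change at that frame


def to_marked_events(series: Sequence[Sequence[float]],
                     threshold: float) -> List[Event]:
    """Convert a multivariate time series into marked events.

    series[t][c] is the value of channel c at frame t. An event is
    emitted whenever a channel changes by at least `threshold`
    between consecutive frames (a hypothetical detection rule).
    """
    events = []
    for t in range(1, len(series)):
        for c, (prev, cur) in enumerate(zip(series[t - 1], series[t])):
            strength = abs(cur - prev)
            if strength >= threshold:
                events.append(Event(t, c, strength))
    return events


# Toy two-channel series over five frames (made-up numbers).
toy = [
    [0.0, 1.0],
    [0.0, 1.0],
    [2.0, 1.0],
    [2.0, 4.0],
    [2.0, 4.0],
]
events = to_marked_events(toy, threshold=1.0)
```

Here the point pattern is the set of frames at which events occur, and the mark attached to each point is the change magnitude. A real system would presumably work with learned features and a probabilistic model of these events rather than a fixed threshold.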
