Recognizing Complex Events in Videos by Learning Key Static-Dynamic Evidences
  • Authors: Kuan-Ting Lai (19) (20)
    Dong Liu (21)
    Ming-Syan Chen (19) (20)
    Shih-Fu Chang (21)
  • Keywords: Video Event Detection; Infinite Push; Key Evidence Selection; ADMM
  • Journal: Lecture Notes in Computer Science
  • Year: 2014
  • Volume: 8691
  • Issue: 1
  • Pages: 675-688
  • Full text size: 1,609 KB
  • Author affiliations:

    19. Graduate Institute of Electrical Engineering, National Taiwan University, Taiwan
    20. Research Center for IT Innovation, Academia Sinica, Taiwan
    21. Department of Electrical Engineering, Columbia University, USA
  • ISSN: 1611-3349
Abstract
Complex events consist of various human interactions with different objects in diverse environments. The evidences needed to recognize events may occur in short time periods with variable lengths and can happen anywhere in a video. This fact prevents conventional machine learning algorithms from effectively recognizing the events. In this paper, we propose a novel method that can automatically identify the key evidences in videos for detecting complex events. Both static instances (objects) and dynamic instances (actions) are considered by sampling frames and temporal segments, respectively. To compare the characteristic power of heterogeneous instances, we embed static and dynamic instances into a multiple instance learning framework via instance similarity measures, and cast the problem as an Evidence Selective Ranking (ESR) process. We impose the ℓ1 norm to select key evidences while using the Infinite Push Loss Function to enforce positive videos to have higher detection scores than negative videos. The Alternating Direction Method of Multipliers (ADMM) algorithm is used to solve the optimization problem. Experiments on large-scale video datasets show that our method improves detection accuracy while providing the unique capability of discovering the key evidences of each complex event.
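For readers who want a concrete picture of the two ingredients named in the abstract, the sketch below is a minimal, assumption-laden NumPy illustration (not the authors' code): a MILES-style similarity embedding that turns a video's bag of static/dynamic instances into a fixed-length vector, and an ℓ1-regularized infinite-push ranking loss. The Gaussian similarity, the prototype set, the unit margin, and the toy data are all illustrative choices; the paper formulates an Evidence Selective Ranking objective and solves it with ADMM rather than by simply evaluating the loss as done here.

    import numpy as np

    def embed_bag(instances, prototypes, sigma=1.0):
        """Map a bag (n_instances x d) to its max Gaussian similarity to each prototype
        (a MILES-style embedding; the prototypes and sigma are assumed, not from the paper)."""
        d2 = ((instances[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        sims = np.exp(-d2 / (2.0 * sigma ** 2))
        return sims.max(axis=0)  # one similarity score per prototype

    def infinite_push_objective(w, X_pos, X_neg, lam=0.1):
        """Hinge surrogate of the infinite push loss plus an l1 penalty:
        the worst (maximum over negatives) mean hinge violation against all positives."""
        s_pos = X_pos @ w  # detection scores of positive videos
        s_neg = X_neg @ w  # detection scores of negative videos
        hinge = np.maximum(0.0, 1.0 - (s_pos[:, None] - s_neg[None, :]))
        return hinge.mean(axis=0).max() + lam * np.abs(w).sum()

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        prototypes = rng.normal(size=(50, 16))  # assumed instance prototypes
        # toy bags of instance descriptors standing in for sampled frames/segments
        pos_bags = [rng.normal(size=(8, 16)) for _ in range(20)]
        neg_bags = [rng.normal(loc=3.0, size=(8, 16)) for _ in range(20)]
        X_pos = np.stack([embed_bag(b, prototypes) for b in pos_bags])
        X_neg = np.stack([embed_bag(b, prototypes) for b in neg_bags])
        w = np.zeros(prototypes.shape[0])
        print("objective at w = 0:", infinite_push_objective(w, X_pos, X_neg))

Because the ℓ1 penalty drives most entries of w to zero, the surviving nonzero weights pick out the prototype instances (frames or segments) whose similarities matter for ranking, which mirrors the key-evidence selection behavior the abstract describes.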
