The Study of Video Anomaly Detection and Localization

Bibliographic Details
Main Authors: Kai-Wen Cheng, 鄭凱文
Other Authors: Yie-Tarng Chen
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/41267686390587246942
id ndltd-TW-105NTUS5428004
record_format oai_dc
spelling ndltd-TW-105NTUS5428004 2017-03-31T04:39:19Z http://ndltd.ncl.edu.tw/handle/41267686390587246942 The Study of Video Anomaly Detection and Localization 影像異常行為的偵測與定位之研究 Kai-Wen Cheng 鄭凱文 Doctoral, National Taiwan University of Science and Technology, Department of Electronic Engineering, 105 Yie-Tarng Chen 陳郁堂 2016 thesis 161 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Taiwan University of Science and Technology === Department of Electronic Engineering === 105 === This dissertation presents a unified framework for video anomaly detection and localization built on hierarchical feature representations, kernel-based statistical models, and tree-based search algorithms. Most research on this topic has focused on detecting local anomalies, that is, video events with unusual appearance or motion. We are more interested in global anomalies, in which multiple video events interact in an unusual manner even though each individual event may be normal. To detect local and global anomalies simultaneously, we first introduce a hierarchical feature structure for video event representation. A statistical model is then built to characterize the normal events in an anomaly-free training set, and on top of it a tree-based inference algorithm is developed to detect and locate abnormal events in previously unseen test videos. Following this same structure, we progressively enrich the feature structures, statistical models, and inference algorithms, each step improving on our earlier methods. We investigate two hierarchical feature representations: 1) the bag-of-words (BOW) histogram and 2) the ensemble of nearby spatio-temporal interest points (STIP); two kernel-based statistical models: 1) the one-class support vector machine (SVM) and 2) Gaussian process regression (GPR); and two inference algorithms: 1) single-instance path search and 2) multiple-instance path search (MiPS). Simulations on five popular benchmarks show that the proposed methods significantly outperform the main state-of-the-art methods while requiring less computation time. We also demonstrate that the framework can be applied to improve many convolutional neural network (CNN) based object recognition methods. This is achieved with an iterative localization refinement (ILR) algorithm, a post-processing scheme that iteratively refines the object detection results to match the ground truth as closely as possible. Simulations show that the proposed method improves on the main state-of-the-art methods on the large-scale PASCAL VOC 2007, PASCAL VOC 2012, and YouTube-Objects datasets.
author2 Yie-Tarng Chen
author_facet Yie-Tarng Chen
Kai-Wen Cheng
鄭凱文
author Kai-Wen Cheng
鄭凱文
spellingShingle Kai-Wen Cheng
鄭凱文
The Study of Video Anomaly Detection and Localization
author_sort Kai-Wen Cheng
title The Study of Video Anomaly Detection and Localization
publishDate 2016
url http://ndltd.ncl.edu.tw/handle/41267686390587246942