The Study of Video Anomaly Detection and Localization
PhD === National Taiwan University of Science and Technology === Department of Electronic Engineering === 105 === This dissertation presents a unified framework for video anomaly detection and localization via hierarchical feature representations, kernel-based statistical models, and tree-based search algorithms. While most research on this topic has focused more on detecti...
Main Authors: | Kai-Wen Cheng, 鄭凱文 |
---|---|
Other Authors: | Yie-Tarng Chen |
Format: | Others |
Language: | en_US |
Published: | 2016 |
Online Access: | http://ndltd.ncl.edu.tw/handle/41267686390587246942 |
id | ndltd-TW-105NTUS5428004 |
---|---|
record_format | oai_dc |
spelling | ndltd-TW-105NTUS5428004 2017-03-31T04:39:19Z http://ndltd.ncl.edu.tw/handle/41267686390587246942 The Study of Video Anomaly Detection and Localization 影像異常行為的偵測與定位之研究 Kai-Wen Cheng 鄭凱文 PhD National Taiwan University of Science and Technology Department of Electronic Engineering 105 This dissertation presents a unified framework for video anomaly detection and localization via hierarchical feature representations, kernel-based statistical models, and tree-based search algorithms. While most research on this topic has focused on detecting local anomalies, that is, video events with unusual appearance or motion, we are more interested in global anomalies, in which multiple video events interact in an unusual manner even though each individual event may be normal. To simultaneously detect local and global anomalies, we first introduce a hierarchical feature structure for video event representation. Then, a statistical model is built to model the normal events in an anomaly-free training set, and a tree-based inference algorithm uses this model to detect and locate abnormal events in previously unseen test videos. Following the same structure, we progressively enrich the feature structures, statistical models, and inference algorithms, each step improving on our previous methods. In this dissertation, we investigate two different hierarchical feature representations: 1) the bag-of-words histogram (BOW) and 2) the ensemble of nearby spatio-temporal interest points (STIP); two different kernel-based statistical models: 1) one-class support vector machine (SVM) and 2) Gaussian process regression (GPR); and two different inference algorithms: 1) single-instance path search and 2) multiple-instance path search (MiPS). Simulations on five popular benchmarks show that the proposed methods significantly outperform the main state-of-the-art methods while requiring less computation time.
We also demonstrate that such a framework can be successfully applied to improve many convolutional neural network (CNN) based object recognition methods. This is achieved by developing an iterative localization refinement (ILR) algorithm, a post-processing scheme that iteratively refines the object detection results so that they match as much of the ground truth as possible. Simulations show that the proposed method can improve the main state-of-the-art works on the large-scale PASCAL VOC 2007, 2012, and YouTube-Objects datasets. Yie-Tarng Chen 陳郁堂 2016 學位論文 ; thesis 161 en_US |
collection | NDLTD |
language | en_US |
format | Others |
sources | NDLTD |
description | PhD === National Taiwan University of Science and Technology === Department of Electronic Engineering === 105 === This dissertation presents a unified framework for video anomaly detection and localization via hierarchical feature representations, kernel-based statistical models, and tree-based search algorithms. While most research on this topic has focused on detecting local anomalies, that is, video events with unusual appearance or motion, we are more interested in global anomalies, in which multiple video events interact in an unusual manner even though each individual event may be normal. To simultaneously detect local and global anomalies, we first introduce a hierarchical feature structure for video event representation. Then, a statistical model is built to model the normal events in an anomaly-free training set, and a tree-based inference algorithm uses this model to detect and locate abnormal events in previously unseen test videos.
Following the same structure, we progressively enrich the feature structures, statistical models, and inference algorithms, each step improving on our previous methods. In this dissertation, we investigate two different hierarchical feature representations: 1) the bag-of-words histogram (BOW) and 2) the ensemble of nearby spatio-temporal interest points (STIP); two different kernel-based statistical models: 1) one-class support vector machine (SVM) and 2) Gaussian process regression (GPR); and two different inference algorithms: 1) single-instance path search and 2) multiple-instance path search (MiPS). Simulations on five popular benchmarks show that the proposed methods significantly outperform the main state-of-the-art methods while requiring less computation time.
We also demonstrate that such a framework can be successfully applied to improve many convolutional neural network (CNN) based object recognition methods. This is achieved by developing an iterative localization refinement (ILR) algorithm, a post-processing scheme that iteratively refines the object detection results so that they match as much of the ground truth as possible. Simulations show that the proposed method can improve the main state-of-the-art works on the large-scale PASCAL VOC 2007, 2012, and YouTube-Objects datasets. |
author2 | Yie-Tarng Chen |
author_facet | Yie-Tarng Chen Kai-Wen Cheng 鄭凱文 |
author | Kai-Wen Cheng 鄭凱文 |
spellingShingle | Kai-Wen Cheng 鄭凱文 The Study of Video Anomaly Detection and Localization |
author_sort | Kai-Wen Cheng |
title | The Study of Video Anomaly Detection and Localization |
title_short | The Study of Video Anomaly Detection and Localization |
title_full | The Study of Video Anomaly Detection and Localization |
title_fullStr | The Study of Video Anomaly Detection and Localization |
title_full_unstemmed | The Study of Video Anomaly Detection and Localization |
title_sort | study of video anomaly detection and localization |
publishDate | 2016 |
url | http://ndltd.ncl.edu.tw/handle/41267686390587246942 |
work_keys_str_mv | AT kaiwencheng thestudyofvideoanomalydetectionandlocalization AT zhèngkǎiwén thestudyofvideoanomalydetectionandlocalization AT kaiwencheng yǐngxiàngyìchángxíngwèidezhēncèyǔdìngwèizhīyánjiū AT zhèngkǎiwén yǐngxiàngyìchángxíngwèidezhēncèyǔdìngwèizhīyánjiū AT kaiwencheng studyofvideoanomalydetectionandlocalization AT zhèngkǎiwén studyofvideoanomalydetectionandlocalization |
_version_ | 1718435855659433984 |