Video Encoding and Classification Based on Statistical Learning

PhD === National Tsing Hua University === Department of Computer Science === 99 === The latest video coding standard of the Joint Video Team (JVT) significantly outperforms previous standards in terms of coding bitrate and video quality, because it adopts several new compression techniques. However, the computational complexity is also dramatical...


Bibliographic Details
Main Authors: Chiang, Chen-Kuo, 江振國
Other Authors: Lai, Shang-Hong
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/02212445092367000500
id ndltd-TW-099NTHU5392086
record_format oai_dc
spelling ndltd-TW-099NTHU5392086 2015-10-13T20:23:01Z http://ndltd.ncl.edu.tw/handle/02212445092367000500 Video Encoding and Classification Based on Statistical Learning 基於統計學習的視訊壓縮與分類 Chiang, Chen-Kuo 江振國 PhD National Tsing Hua University Department of Computer Science 99 Lai, Shang-Hong 賴尚宏 2011 thesis 101 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description PhD === National Tsing Hua University === Department of Computer Science === 99 === The latest video coding standard of the Joint Video Team (JVT) significantly outperforms previous standards in coding bitrate and video quality because it adopts several new compression techniques. However, these new components also dramatically increase the computational complexity. In this thesis, we propose a general statistical learning approach to reduce the computational cost of a video encoder. This approach can be easily applied to many encoder components, such as inter-mode decision, multi-reference motion estimation, and intra-mode prediction. First, representative features are chosen by feature analysis on a number of training video sequences. Then, the selected features are used to train sub-classifiers for partial classification problems. After training, these sub-classifiers are integrated into a complete classifier. Last, an off-line pre-classification step computes the results for all possible combinations of the quantized features, and the results are stored as a lookup table. During run-time encoding, features are extracted and quantized, and the classification results are read directly from the lookup table, which significantly reduces encoding time. The proposed statistical-learning-based approach is applied to three main components of the video encoder to speed up computation. Video classification can be accomplished while encoding. This work also proposes a novel component-level dictionary learning framework that exploits image group characteristics within sparse coding.
Unlike previous methods, which select the dictionaries that best reconstruct the data, we present an energy minimization formulation that jointly optimizes the learning of the sparse dictionary and the component-level importance within one unified framework, giving a discriminative representation for image groups. The importance measures how well each feature component represents the image group property with the dictionary, using histogram information. The dictionaries are then updated iteratively to reduce the influence of unimportant components, thus refining the sparse representation for each image group. We test the proposed learning-based video encoding system and video categorization algorithm on publicly available real video sequences. Our experiments show that the proposed learning-based H.264 encoder is about 4 to 5 times faster, over the entire encoding process, than the EPZS algorithm, which is included in the H.264 reference code for its efficiency, with slight degradation in video quality. In addition, the proposed component-learning-based algorithm is more accurate than previous methods in video classification experiments on a real video database.
author2 Lai, Shang-Hong
author_facet Lai, Shang-Hong
Chiang, Chen-Kuo
江振國
author Chiang, Chen-Kuo
江振國
spellingShingle Chiang, Chen-Kuo
江振國
Video Encoding and Classification Based on Statistical Learning
author_sort Chiang, Chen-Kuo
title Video Encoding and Classification Based on Statistical Learning
title_short Video Encoding and Classification Based on Statistical Learning
title_full Video Encoding and Classification Based on Statistical Learning
title_fullStr Video Encoding and Classification Based on Statistical Learning
title_full_unstemmed Video Encoding and Classification Based on Statistical Learning
title_sort video encoding and classification based on statistical learning
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/02212445092367000500
work_keys_str_mv AT chiangchenkuo videoencodingandclassificationbasedonstatisticallearning
AT jiāngzhènguó videoencodingandclassificationbasedonstatisticallearning
AT chiangchenkuo jīyútǒngjìxuéxídeshìxùnyāsuōyǔfēnlèi
AT jiāngzhènguó jīyútǒngjìxuéxídeshìxùnyāsuōyǔfēnlèi
_version_ 1718047072546979840
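
The off-line pre-classification and run-time lookup described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the two feature names (`sad`, `motion`), the bin thresholds, the mode labels, and `toy_classifier`, which merely stands in for the learned classifier.

```python
from itertools import product

# Hypothetical features and quantization thresholds (assumed, not from the thesis).
FEATURES = ["sad", "motion"]
BIN_EDGES = {
    "sad": [500, 2000, 8000],  # 3 thresholds -> 4 bins
    "motion": [1, 4],          # 2 thresholds -> 3 bins
}


def quantize(value, edges):
    """Return the index of the bin that `value` falls into."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def toy_classifier(bins):
    """Stand-in for the learned classifier (illustrative rules only)."""
    sad_bin, motion_bin = bins
    if sad_bin == 0 and motion_bin == 0:
        return "SKIP"
    if sad_bin <= 1:
        return "INTER_16x16"
    return "INTER_8x8" if motion_bin < 2 else "INTRA"


def build_lookup_table(classifier):
    """Off-line step: classify every combination of quantized features once."""
    ranges = [range(len(BIN_EDGES[name]) + 1) for name in FEATURES]
    return {bins: classifier(bins) for bins in product(*ranges)}


def classify_at_runtime(table, raw_features):
    """Run-time step: quantize the raw features, then do one table lookup."""
    key = tuple(quantize(raw_features[name], BIN_EDGES[name]) for name in FEATURES)
    return table[key]


LOOKUP_TABLE = build_lookup_table(toy_classifier)
```

In this sketch the classifier is evaluated only while the table is built; at run time the cost per decision is one quantization pass and one dictionary lookup, which is the source of the speed-up the abstract describes. The table size grows with the product of the bin counts, so the number of features and bins has to stay small.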