Video Encoding and Classification Based on Statistical Learning

PhD === National Tsing Hua University === Department of Computer Science === 99 === The latest video coding standard of the Joint Video Team (JVT) significantly outperforms previous standards in coding bitrate and video quality because it adopts several new compression techniques. However, the computational complexity is also dramatical...

Bibliographic Details
Main Authors:	Chiang, Chen-Kuo, 江振國
Other Authors:	Lai, Shang-Hong
Format:	Others
Language:	en_US
Published:	2011
Online Access:	http://ndltd.ncl.edu.tw/handle/02212445092367000500

id	ndltd-TW-099NTHU5392086
record_format	oai_dc
spelling	ndltd-TW-099NTHU5392086 2015-10-13T20:23:01Z http://ndltd.ncl.edu.tw/handle/02212445092367000500 Video Encoding and Classification Based on Statistical Learning 基於統計學習的視訊壓縮與分類 Chiang, Chen-Kuo 江振國 PhD, National Tsing Hua University, Department of Computer Science, 99 Lai, Shang-Hong 賴尚宏 2011 thesis 101 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	PhD === National Tsing Hua University === Department of Computer Science === 99 === The latest video coding standard of the Joint Video Team (JVT) significantly outperforms previous standards in coding bitrate and video quality because it adopts several new compression techniques. However, these new components also dramatically increase the computational complexity. In this thesis, we propose a general statistical learning approach to reduce the computational cost of a video encoder. The approach can be applied to many encoder components, such as inter-mode decision, multi-reference motion estimation, and intra-mode prediction. First, representative features are chosen by feature analysis over a number of training video sequences. Then, the selected features are used to train sub-classifiers for partial classification problems. After training, these sub-classifiers are integrated into a complete classifier. Finally, an off-line pre-classification step computes the classification results for all possible combinations of the quantized features and stores them in a lookup table. During run-time encoding, features are extracted and quantized, and the classification result is read directly from the lookup table, so the encoding time can be significantly reduced. The proposed statistical-learning-based approach is applied to three main components of a video encoder to speed up the computation.
Video classification can be accomplished while encoding. This work proposes a novel component-level dictionary learning framework that exploits image group characteristics within sparse coding. Unlike previous methods, which select the dictionaries that best reconstruct the data, we present an energy minimization formulation that jointly optimizes the learning of the sparse dictionary and the component-level importance within one unified framework, giving a discriminative representation for image groups. The importance measures, using histogram information, how well each feature component represents the image group property with the dictionary. The dictionaries are then updated iteratively to reduce the influence of unimportant components, thus refining the sparse representation for each image group. We test the proposed learning-based video encoding system and video categorization algorithm on publicly available real video sequences. Our experiments show that the proposed learning-based H.264 encoder is about 4 to 5 times faster than the EPZS algorithm, which is included in the H.264 reference code for its efficiency, over the entire encoding process, with slight video quality degradation. In addition, the proposed component-learning-based algorithm is more accurate than previous methods in video classification experiments on a real video database.
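The pre-classification idea in the description (quantize features, classify every possible combination off-line, then answer run-time queries by table lookup) can be sketched roughly as follows. This is only an illustration under assumptions, not the thesis implementation: the two features, their ranges and bin counts in `FEATURE_SPECS`, the labels `"SKIP"` and `"FULL_SEARCH"`, and the rule inside `toy_classifier` are all hypothetical stand-ins for the learned classifier and features the abstract refers to.

```python
from itertools import product

# Hypothetical feature specs: (lower bound, upper bound, number of bins).
# The real features and quantization levels are not given in the abstract.
FEATURE_SPECS = [(0.0, 1024.0, 8), (0.0, 256.0, 4)]


def quantize(value, lo, hi, levels):
    """Map a raw feature value to a bin index in [0, levels - 1], clamping at the ends."""
    if value <= lo:
        return 0
    if value >= hi:
        return levels - 1
    return min(levels - 1, int((value - lo) / (hi - lo) * levels))


def toy_classifier(bins):
    """Stand-in for the integrated learned classifier (an assumption, not the thesis model)."""
    return "SKIP" if sum(bins) < 3 else "FULL_SEARCH"


def build_lookup_table(classifier, levels_per_feature):
    """Off-line step: classify every combination of quantized features once."""
    all_bins = product(*(range(n) for n in levels_per_feature))
    return {bins: classifier(bins) for bins in all_bins}


def classify_at_runtime(table, raw_features):
    """Run-time step: quantize the features, then read the decision from the table."""
    bins = tuple(
        quantize(v, lo, hi, n) for v, (lo, hi, n) in zip(raw_features, FEATURE_SPECS)
    )
    return table[bins]


TABLE = build_lookup_table(toy_classifier, [n for _, _, n in FEATURE_SPECS])
```

With these settings the table holds 8 × 4 = 32 entries, so each run-time decision costs one quantization per feature plus a single dictionary lookup, regardless of how expensive the underlying classifier is. The trade-off is that the table size grows multiplicatively with the number of features and bins.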
author2	Lai, Shang-Hong
author_facet	Lai, Shang-Hong Chiang, Chen-Kuo 江振國
author	Chiang, Chen-Kuo 江振國
spellingShingle	Chiang, Chen-Kuo 江振國 Video Encoding and Classification Based on Statistical Learning
author_sort	Chiang, Chen-Kuo
title	Video Encoding and Classification Based on Statistical Learning
title_short	Video Encoding and Classification Based on Statistical Learning
title_full	Video Encoding and Classification Based on Statistical Learning
title_fullStr	Video Encoding and Classification Based on Statistical Learning
title_full_unstemmed	Video Encoding and Classification Based on Statistical Learning
title_sort	video encoding and classification based on statistical learning
publishDate	2011
url	http://ndltd.ncl.edu.tw/handle/02212445092367000500
work_keys_str_mv	AT chiangchenkuo videoencodingandclassificationbasedonstatisticallearning AT jiāngzhènguó videoencodingandclassificationbasedonstatisticallearning AT chiangchenkuo jīyútǒngjìxuéxídeshìxùnyāsuōyǔfēnlèi AT jiāngzhènguó jīyútǒngjìxuéxídeshìxùnyāsuōyǔfēnlèi
_version_	1718047072546979840

Similar Items