Video Encoding and Classification Based on Statistical Learning



Bibliographic Details
Main Author: Chiang, Chen-Kuo (江振國)
Other Authors: Lai, Shang-Hong
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/02212445092367000500
Description
Summary: PhD === National Tsing Hua University === Department of Computer Science === 99 === The latest video coding standard of the Joint Video Team (JVT) significantly outperforms previous standards in terms of coding bitrate and video quality because it adopts several new compression techniques. However, the computational complexity is also dramatically increased by these new components. In this thesis, we propose a general statistical learning approach to reduce the computational cost of the video encoder. The approach can easily be applied to many components of the encoder, such as inter-mode decision, multi-reference motion estimation, and intra-mode prediction. First, representative features are chosen through feature analysis on a number of training video sequences. The selected features are then used to train sub-classifiers for partial classification problems, and these sub-classifiers are integrated into a complete classifier. Finally, an off-line pre-classification step computes the classification results for all possible combinations of the quantized features and stores them in a lookup table. During run-time encoding, features are extracted and quantized, and the classification result is read directly from the lookup table, so the encoding time is significantly reduced. The proposed statistical learning based approach is applied to three main components of the video encoder to speed up the computation, and video classification can be accomplished while encoding.

A novel component-level dictionary learning framework that exploits image-group characteristics within sparse coding is also proposed in this work. Unlike previous methods, which select the dictionaries that best reconstruct the data, we present an energy minimization formulation that jointly optimizes the sparse dictionary and the component-level importance within one unified framework to give a discriminative representation for image groups. The importance measures how well each feature component represents the image-group property with the dictionary, using histogram information. The dictionaries are then updated iteratively to reduce the influence of unimportant components, thus refining the sparse representation for each image group.

We test the proposed learning-based video encoding system and video categorization algorithm on publicly available video sequences. Our experiments show that the proposed learning-based H.264 encoder is about 4 to 5 times faster, over the entire encoding process, than the EPZS algorithm, which is included in the H.264 reference code for its efficiency, with only slight video quality degradation. In addition, the proposed component learning based algorithm is more accurate than previous methods in video classification experiments on a real video database.
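
The off-line pre-classification idea summarized above can be illustrated with a minimal sketch. All names, bin counts, and the use of a linear SVM as a stand-in for the integrated sub-classifiers are assumptions for illustration; the abstract does not specify these details.

```python
# Illustrative sketch only: feature names, bin counts, and the classifier are
# assumptions, not taken from the thesis.
import itertools
import numpy as np
from sklearn.svm import LinearSVC

FEATURE_BINS = [8, 8, 4]   # assumed number of quantization levels per feature

def quantize(features, bins=FEATURE_BINS):
    """Map raw feature values in [0, 1) to discrete bin indices."""
    return tuple(min(int(f * b), b - 1) for f, b in zip(features, bins))

# --- Off-line stage ---------------------------------------------------------
# Train a classifier on features from training sequences (LinearSVC is a
# stand-in for the integrated sub-classifiers described in the abstract).
X_train = np.random.rand(1000, len(FEATURE_BINS))   # placeholder training features
y_train = np.random.randint(0, 2, 1000)             # placeholder mode-decision labels
clf = LinearSVC().fit(X_train, y_train)

# Pre-classify every combination of quantized feature values and store the
# decisions in a lookup table.
lookup = {}
for combo in itertools.product(*(range(b) for b in FEATURE_BINS)):
    center = np.array([(c + 0.5) / b for c, b in zip(combo, FEATURE_BINS)])
    lookup[combo] = int(clf.predict(center.reshape(1, -1))[0])

# --- Run-time encoding stage ------------------------------------------------
def fast_decision(raw_features):
    """Constant-time table lookup instead of evaluating the classifier."""
    return lookup[quantize(raw_features)]

print(fast_decision([0.37, 0.81, 0.12]))
```

The speed-up comes from replacing per-block classifier evaluations with a single table lookup over quantized features, at the cost of the quantization granularity chosen off-line.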
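The interplay between dictionary learning and component-level importance can likewise be sketched. The inverse reconstruction-error weighting below is an assumed proxy for the histogram-based importance measure mentioned in the abstract, and all function and parameter names are illustrative.

```python
# Illustrative sketch only: the inverse reconstruction-error weighting is an
# assumed proxy for the histogram-based importance measure in the thesis.
import numpy as np
from sklearn.decomposition import DictionaryLearning

def learn_group_dictionary(X, n_atoms=32, n_iters=5):
    """X: (n_samples, n_features) descriptors from one image group."""
    weights = np.ones(X.shape[1])                  # component-level importance
    for _ in range(n_iters):
        Xw = X * weights                           # emphasize important components
        dl = DictionaryLearning(n_components=n_atoms, alpha=1.0, max_iter=200)
        codes = dl.fit_transform(Xw)               # sparse codes over the dictionary
        recon = codes @ dl.components_             # reconstruction of weighted data
        err = np.mean((Xw - recon) ** 2, axis=0)   # per-component residual
        weights = 1.0 / (err + 1e-8)               # down-weight poorly represented components
        weights /= weights.max()
    return dl.components_, weights

# Usage on random placeholder descriptors for one image group:
D, w = learn_group_dictionary(np.random.rand(200, 64))
```

Iterating the two steps lets components that the dictionary represents poorly lose influence, which is the refinement effect the abstract attributes to the joint optimization.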