Low-Cost Hardware-Sharing Design and Chip Implementation of Fast Multiple Integer Transforms for Multi-Standard Video Codec

Bibliographic Details
Main Authors: Shun-Ji Hsu, 許順吉
Other Authors: 范志鵬
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/80321149368629568548
Description
Summary: Master's === National Chung Hsing University === Department of Electrical Engineering === 99 === This thesis proposes fast multiple integer transform algorithms and their hardware-sharing designs for H.264/AVC, AVS, VC-1, and MPEG-1/2/4. The designs are derived with matrix operations: row/column permutations, decompositions into sparse matrices, and matrix offset computations. Through factorization and shift-and-add computation, the proposed 1-D hardware-sharing transform scheme needs no multipliers, which saves hardware area. In addition, decomposing the original transform matrices into products of sparse matrices reduces the computational complexity. For the 2-D transforms, the hardware uses a two-stage row-column scheme built from the proposed 1-D hardware-sharing architecture and a transpose memory implemented as a single register array. In the first stage, the columns of the input data are transformed by the 1-D transform and the results are written into the transpose memory. In the second stage, the rows of the first-stage outputs are read from the transpose memory and transformed consecutively. For a multi-standard video codec, the proposed 1-D hardware-sharing inverse, forward, and inverse/forward transform designs reduce the gate count by 45%, 51%, and 49%, respectively, compared with individual, separate realizations. The proposed 2-D hardware-sharing inverse, forward, and inverse/forward transform designs require 95188, 62591, and 132655 gates, respectively, and operate at clock frequencies up to 125 MHz. According to the synthesis results, the throughput reaches 1000M pixels/sec at 125 MHz.
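
The two ideas in the summary, a multiplier-free 1-D transform and a two-stage 2-D transform with a transpose memory, can be illustrated with a small software sketch. It is not the thesis's shared architecture: it assumes only the well-known 4x4 forward core transform of H.264/AVC, and the names `fwd4_shift_add`, `fwd4_matrix`, `transform_2d`, and `transform_2d_reference` are hypothetical. The butterfly version uses only additions, subtractions, and one-bit left shifts, and is checked against a plain matrix product.

```python
# Illustrative sketch only: the 4x4 H.264/AVC forward core transform,
# not the multi-standard shared design described in the thesis.

# Forward core transform matrix of H.264/AVC (4x4).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def fwd4_matrix(x):
    """Reference 1-D transform: plain matrix-vector product CF * x."""
    return [sum(c * v for c, v in zip(row, x)) for row in CF]


def fwd4_shift_add(x):
    """Same 1-D transform using only add, subtract, and shift (no multiply)."""
    x0, x1, x2, x3 = x
    s0, s1 = x0 + x3, x1 + x2      # butterfly sums
    d0, d1 = x0 - x3, x1 - x2      # butterfly differences
    return [
        s0 + s1,                   # row [1,  1,  1,  1]
        (d0 << 1) + d1,            # row [2,  1, -1, -2]
        s0 - s1,                   # row [1, -1, -1,  1]
        d0 - (d1 << 1),            # row [1, -2,  2, -1]
    ]


def transform_2d(block):
    """Two-stage row-column 2-D transform with a transpose buffer."""
    n = 4
    # Stage 1: transform each input column and store the result.
    # transpose_mem[c][k] holds element (k, c) of CF * X.
    transpose_mem = []
    for c in range(n):
        column = [block[r][c] for r in range(n)]
        transpose_mem.append(fwd4_shift_add(column))
    # Stage 2: read the buffer row-wise and transform each row.
    out = []
    for k in range(n):
        row = [transpose_mem[c][k] for c in range(n)]
        out.append(fwd4_shift_add(row))
    return out


def transform_2d_reference(block):
    """Reference 2-D transform: CF * X * CF^T by direct matrix products."""
    n = 4
    t = [[sum(CF[i][k] * block[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return [[sum(t[i][k] * CF[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    sample = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
    print(transform_2d(sample) == transform_2d_reference(sample))
```

In hardware the transpose buffer would be the register array and the 1-D unit would be reused for both stages; here both are ordinary Python lists and function calls. As a side calculation, the reported 1000M pixels/sec at 125 MHz corresponds to 8 pixels per clock cycle.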