Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems

Ph.D. === National Chung Hsing University === Department of Electrical Engineering === 101 === Multiple Input Multiple Output (MIMO) systems often impose tremendous computing overheads, in the form of matrix operations, on baseband signal processing. This becomes a formidable barrier to real-time system implementation. In particular, precoding and sign...

Bibliographic Details
Main Authors:	Wei-Da Chen, 陳韋達
Other Authors:	Yin-Tsung Hwang
Format:	Others
Language:	en_US
Published:	2013
Online Access:	http://ndltd.ncl.edu.tw/handle/08261397886682974638

id	ndltd-TW-101NCHU5441087
record_format	oai_dc
spelling	ndltd-TW-101NCHU5441087 2017-10-29T04:34:26Z http://ndltd.ncl.edu.tw/handle/08261397886682974638 Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems 應用於多輸入多輸出正交分頻多工系統預編碼與訊號偵測之快速矩陣分解法設計與晶片實現 Wei-Da Chen 陳韋達 Ph.D. National Chung Hsing University Department of Electrical Engineering 101 Yin-Tsung Hwang 黃穎聰 2013 thesis 132 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Ph.D. === National Chung Hsing University === Department of Electrical Engineering === 101 === Multiple Input Multiple Output (MIMO) systems often impose tremendous computing overheads, in the form of matrix operations, on baseband signal processing. This becomes a formidable barrier to real-time system implementation. In particular, precoding and signal detection are the two most computation-intensive modules. In this dissertation, we start with an investigation of various matrix decomposition schemes commonly used in MIMO signal processing. The applications of these decomposition schemes to MIMO signal detection and precoding are first reviewed in chapter 2. In particular, QR decomposition and geometric mean decomposition (GMD) are chosen for QR-BLAST-based MIMO signal detection and MIMO signal precoding, respectively. In the QR decomposition part, two versions of the design are presented in chapter 3 and chapter 4, respectively. The first is a high-throughput, fully parallel complex-valued QR decomposition (CQRD) design using real-valued Givens rotations only. Its lower computing complexity relative to various other decomposition schemes is shown. With careful scheduling, one CQRD computation can be completed in 8 clock cycles. Chip designs for 2 × 2 and 4 × 4 matrices, largely following the IEEE 802.11n standard, are developed. Implementation results in TSMC 0.18 µm process technology show that both designs are capable of performing 15M CQRDs per second. The second CQRD design adds a minimum mean square error (MMSE) enhancement to the first one. By applying an additional DSP folding technique, the design takes only four clock cycles to perform a 4 × 4 complex-valued MMSE-QR decomposition. An ASIC fabricated in TSMC 0.18 µm process technology and FPGA implementations on two types of FPGA devices (Xilinx and Altera) are presented. In the GMD part, two versions of an efficient computing scheme are developed in chapter 5 and chapter 6. Unlike conventional SVD-based GMD algorithms, both schemes use matrix bi-diagonalization rather than SVD as the pre-processing step. They also feature lower computing complexity, permutation-free operation, and hardware sharing between the precoding and signal detection modules. The first version of the GMD computing scheme adopts a progressive approach and obtains the GMD result incrementally, starting from a 2 × 2 sub-matrix. The second version adopts a divide-and-conquer computing strategy. Computing complexity analyses indicate at least 30% higher computing efficiency than other SVD-based GMD computing schemes. Chapter 7 addresses the hardware implementation. The scheme is mapped to a fully parallel and deeply pipelined architecture in which one GMD computation of a 4 × 4 complex-valued matrix can be completed every 4 clock cycles. It also features a joint design supporting two computing modes, i.e., QRD for signal decoding and GMD for precoding. Chip implementation in TSMC 90 nm CMOS technology shows that, at a maximum clock frequency of 170 MHz, the design can perform 42.5M GMD or QRD computations per second. Finally, chapter 8 presents the conclusions and future work of this dissertation.
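The abstract above describes QR decomposition by Givens rotations. As a rough software illustration only, the Python sketch below performs a textbook complex-valued Givens QR. It uses complex 2 × 2 rotations, not the real-valued-only rotations, scheduling, or hardware mapping of the dissertation. The function name `givens_qr` and the adjacent-row annihilation order are assumptions made for this example.

```python
import math


def givens_qr(a):
    """QR-factor a square complex matrix with Givens rotations.

    Returns (q, r) such that a == q @ r, q is unitary and r is upper
    triangular. Textbook sketch; not the thesis's real-valued scheme.
    """
    n = len(a)
    r = [[complex(x) for x in row] for row in a]
    # qh accumulates the applied rotations, so that qh @ a == r.
    qh = [[complex(i == j) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Zero the sub-diagonal entries of this column from the bottom up,
        # each time rotating a pair of adjacent rows.
        for row in range(n - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            norm = math.hypot(abs(x), abs(y))
            if norm == 0.0:
                continue  # both entries already zero, nothing to annihilate
            c, s = x / norm, y / norm
            # The unitary G = [[conj(c), conj(s)], [-s, c]] maps (x, y) to (norm, 0).
            for m in (r, qh):
                top, bot = m[row - 1], m[row]
                m[row - 1] = [c.conjugate() * t + s.conjugate() * b
                              for t, b in zip(top, bot)]
                m[row] = [-s * t + c * b for t, b in zip(top, bot)]
    # Q is the conjugate transpose of the accumulated rotations.
    q = [[qh[j][i].conjugate() for j in range(n)] for i in range(n)]
    return q, r
```

Each rotation only touches two rows, which is why rotations on disjoint row pairs can in principle be carried out in parallel; how the dissertation schedules them is not described in this record.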
author2	Yin-Tsung Hwang
author_facet	Yin-Tsung Hwang Wei-Da Chen 陳韋達
author	Wei-Da Chen 陳韋達
title	Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems
publishDate	2013
url	http://ndltd.ncl.edu.tw/handle/08261397886682974638
