Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems

Doctoral === National Chung Hsing University === Department of Electrical Engineering === 101 === Multiple Input Multiple Output (MIMO) systems often impose tremendous computing overhead on baseband signal processing in the form of matrix operations. This becomes a formidable barrier to real-time system implementation. In particular, precoding and sign...


Bibliographic Details
Main Authors: Wei-Da Chen, 陳韋達
Other Authors: Yin-Tsung Hwang
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/08261397886682974638
id ndltd-TW-101NCHU5441087
record_format oai_dc
spelling ndltd-TW-101NCHU5441087 2017-10-29T04:34:26Z http://ndltd.ncl.edu.tw/handle/08261397886682974638 Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems 應用於多輸入多輸出正交分頻多工系統預編碼與訊號偵測之快速矩陣分解法設計與晶片實現 Wei-Da Chen 陳韋達 Doctoral, National Chung Hsing University, Department of Electrical Engineering, 101 Yin-Tsung Hwang 黃穎聰 2013 thesis 132 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Chung Hsing University === Department of Electrical Engineering === 101 === Multiple Input Multiple Output (MIMO) systems often impose tremendous computing overhead on baseband signal processing in the form of matrix operations. This becomes a formidable barrier to real-time system implementation. In particular, precoding and signal detection are the two most computation-intensive modules. This dissertation starts with an investigation of the matrix decomposition schemes commonly used in MIMO signal processing. The applications of these decomposition schemes to MIMO signal detection and precoding are first reviewed in chapter 2. QR decomposition and geometric mean decomposition (GMD) are chosen for QR-BLAST-based MIMO signal detection and MIMO signal precoding, respectively. In the QR decomposition part, two versions of the design are presented in chapter 3 and chapter 4, respectively. The first is a high-throughput, fully parallel Complex-valued QR Decomposition (CQRD) design using real-valued Givens rotations only. Its lower computing complexity relative to various other decomposition schemes is shown. With careful scheduling, one CQRD computation can be completed in 8 clock cycles. Chip designs for 2 × 2 and 4 × 4 matrices, largely following the IEEE 802.11n standard, are developed. The implementation results in TSMC 0.18 µm process technology show that both designs are capable of performing 15M CQRDs per second. The second CQRD design is a minimum mean square error (MMSE) enhancement of the first one. By applying an additional DSP folding technique, the design takes only four clock cycles to perform a 4 × 4 complex-valued MMSE-QR decomposition. An ASIC in TSMC 0.18 µm process technology and FPGA implementations on two types of FPGA devices (Xilinx and Altera) are developed. In the GMD part, two versions of an efficient computing scheme are developed in chapter 5 and chapter 6. Unlike conventional SVD-based GMD algorithms, both schemes use matrix bi-diagonalization rather than SVD as the pre-processing step. They also feature lower computing complexity, permutation-free operations, and hardware sharing between the precoding and the signal detection modules. The first version of the GMD computing scheme adopts a progressive approach and obtains the GMD result incrementally, starting from a 2 × 2 sub-matrix. The second version adopts a divide-and-conquer computing strategy. Computing complexity analyses indicate at least 30% higher computing efficiency than other SVD-based GMD computing schemes. Chapter 7 addresses the hardware implementation. The scheme is mapped to a fully parallel and deeply pipelined architecture in which one GMD computation of a 4 × 4 complex-valued matrix can be accomplished every 4 clock cycles. It also features a joint design supporting two computing modes, i.e. QRD for signal decoding and GMD for precoding. Chip implementation in TSMC 90 nm CMOS technology shows that, with a maximum clock frequency of up to 170 MHz, the design can perform 42.5M GMD or QRD computations per second. Finally, chapter 8 presents the conclusions and future work of this dissertation.
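For readers unfamiliar with the QR decomposition step the abstract refers to, the following is a minimal software sketch, not the dissertation's architecture. It uses generic complex-valued Givens rotations, whereas the abstract describes a hardware design built from real-valued rotations only. The function names `givens_qr` and `decompositions_per_second` are hypothetical, chosen for illustration. The second function only restates the abstract's throughput arithmetic (clock frequency divided by cycles per decomposition).

```python
import math


def givens_qr(a):
    """Return (q, r) with a = q @ r, q unitary and r upper triangular.

    Illustrative only: generic complex Givens rotations in software,
    not the real-valued-rotation hardware scheme of the dissertation.
    """
    n_rows, n_cols = len(a), len(a[0])
    r = [[complex(x) for x in row] for row in a]
    # qh accumulates the rotations, so that qh @ a == r at every step.
    qh = [[complex(i == j) for j in range(n_rows)] for i in range(n_rows)]
    for col in range(min(n_rows - 1, n_cols)):
        for row in range(col + 1, n_rows):
            x, y = r[col][col], r[row][col]
            norm = math.hypot(abs(x), abs(y))
            if norm == 0.0:
                continue  # nothing to eliminate
            c, s = x / norm, y / norm
            # Apply the unitary rotation [[conj(c), conj(s)], [-s, c]]
            # to rows `col` and `row`; this zeroes r[row][col].
            for m in (r, qh):
                top, bot = m[col], m[row]
                m[col] = [c.conjugate() * t + s.conjugate() * b
                          for t, b in zip(top, bot)]
                m[row] = [-s * t + c * b for t, b in zip(top, bot)]
    # q is the conjugate transpose of the accumulated rotations.
    q = [[qh[j][i].conjugate() for j in range(n_rows)] for i in range(n_rows)]
    return q, r


def decompositions_per_second(clock_hz, cycles_per_decomposition):
    """Throughput of a pipeline that finishes one result every N cycles."""
    return clock_hz / cycles_per_decomposition


if __name__ == "__main__":
    q, r = givens_qr([[1 + 2j, 2 - 1j], [3 + 0j, 1j]])
    print(r)
    # Figures quoted in the abstract: 170 MHz clock, 4 cycles per GMD/QRD.
    print(decompositions_per_second(170e6, 4))
```

With the abstract's figures, 170 MHz divided by 4 cycles gives the quoted 42.5M decompositions per second.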