Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems
Doctoral === National Chung Hsing University === Department of Electrical Engineering === 101 === Multiple-input multiple-output (MIMO) systems often impose tremendous computational overhead, in the form of matrix operations, on baseband signal processing. This becomes a formidable barrier to real-time system implementation. In particular, precoding and sign...
Main Authors: | Wei-Da Chen, 陳韋達 |
---|---|
Other Authors: | Yin-Tsung Hwang |
Format: | Others |
Language: | en_US |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/08261397886682974638 |
id |
ndltd-TW-101NCHU5441087 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101NCHU5441087 2017-10-29T04:34:26Z http://ndltd.ncl.edu.tw/handle/08261397886682974638 Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems 應用於多輸入多輸出正交分頻多工系統預編碼與訊號偵測之快速矩陣分解法設計與晶片實現 Wei-Da Chen 陳韋達 Doctoral National Chung Hsing University Department of Electrical Engineering 101 Yin-Tsung Hwang 黃穎聰 2013 thesis 132 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Doctoral === National Chung Hsing University === Department of Electrical Engineering === 101 === Multiple-input multiple-output (MIMO) systems often impose tremendous computational overhead, in the form of matrix operations, on baseband signal processing. This becomes a formidable barrier to real-time system implementation. In particular, precoding and signal detection are the two most computation-intensive modules. In this dissertation, we start with an investigation of the matrix decomposition schemes commonly used in MIMO signal processing. Chapter 2 first reviews the applications of these decomposition schemes to MIMO signal detection and precoding. QR decomposition and geometric mean decomposition (GMD) are chosen for QR-BLAST-based MIMO signal detection and MIMO signal precoding, respectively.
In the QR decomposition part, two versions of the design are presented in chapters 3 and 4. The first is a high-throughput, fully parallel complex-valued QR decomposition (CQRD) design that uses only real-valued Givens rotations. Its lower computing complexity relative to various other decomposition schemes is shown. With careful scheduling, one CQRD computation can be completed in 8 clock cycles. Chip designs for 2 × 2 and 4 × 4 matrices, largely following the IEEE 802.11n standard, are developed. Implementation results in a TSMC 0.18 µm process technology show that both designs can perform 15M CQRDs per second. The second CQRD design adds a minimum mean square error (MMSE) enhancement to the first. By applying an additional DSP folding technique, the design takes only 4 clock cycles to perform a 4 × 4 complex-valued MMSE-QR decomposition. An ASIC implementation in a TSMC 0.18 µm process technology and FPGA implementations on two types of FPGA devices (Xilinx and Altera) are developed.
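To illustrate the idea of a complex-valued QR decomposition built from real-valued Givens rotations, the following is a minimal software sketch, not the dissertation's hardware architecture or schedule. It assumes one common approach: each row is first phase-rotated so that the working column becomes real (a rotation in the real/imaginary plane), and real Givens rotations then annihilate the entries below the diagonal. The function name `cqrd_givens` and the list-of-lists matrix layout are assumptions made for this example.

```python
import cmath
import math


def cqrd_givens(a):
    """Return (q, r) with a = q * r for a square complex matrix `a`.

    `r` is upper triangular with a real, non-negative diagonal.
    Illustrative sketch only; the name and interface are hypothetical.
    """
    n = len(a)
    r = [[complex(x) for x in row] for row in a]   # working copy, becomes R
    # qh accumulates every applied rotation, so that qh * a = r (qh = Q^H).
    qh = [[complex(i == j) for j in range(n)] for i in range(n)]

    for col in range(n):
        # Step 1: phase-rotate each row so its entry in `col` is real and >= 0.
        for row in range(col, n):
            phase = cmath.exp(-1j * cmath.phase(r[row][col]))
            r[row] = [phase * x for x in r[row]]
            qh[row] = [phase * x for x in qh[row]]

        # Step 2: real Givens rotations on adjacent rows, bottom up,
        # zero the entries below the diagonal in this column.
        for row in range(n - 1, col, -1):
            x, y = r[row - 1][col].real, r[row][col].real
            norm = math.hypot(x, y)
            if norm == 0.0:
                continue                            # nothing to annihilate
            c, s = x / norm, y / norm               # real cosine and sine
            for m in (r, qh):
                upper, lower = m[row - 1], m[row]
                m[row - 1] = [c * u + s * v for u, v in zip(upper, lower)]
                m[row] = [-s * u + c * v for u, v in zip(upper, lower)]

    # Q is the conjugate transpose of the accumulated rotations.
    q = [[qh[j][i].conjugate() for j in range(n)] for i in range(n)]
    return q, r
```

Calling `cqrd_givens(h)` on a 4 × 4 channel matrix `h` given as nested lists of complex numbers returns `q` and `r` such that multiplying them reproduces `h` up to rounding error. In this sketch every rotation angle is real, which is the property that lets such a scheme map onto real-valued rotation hardware.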
In the GMD part, two versions of an efficient computing scheme are developed in chapters 5 and 6. Unlike conventional SVD-based GMD algorithms, both schemes use matrix bidiagonalization rather than SVD as the pre-processing step. They also feature lower computing complexity, permutation-free operation, and hardware sharing between the precoding and signal detection modules. The first version adopts a progressive approach and obtains the GMD result incrementally, starting from a 2×2 sub-matrix. The second version adopts a divide-and-conquer computing strategy. Computing complexity analyses indicate at least 30% higher computing efficiency than other SVD-based GMD computing schemes. Chapter 7 addresses the hardware implementation. The scheme is mapped to a fully parallel, deeply pipelined architecture in which one GMD computation of a 4×4 complex-valued matrix can be completed every 4 clock cycles. The design also supports two computing modes: QRD for signal decoding and GMD for precoding. Chip implementation in TSMC 90 nm CMOS technology shows that, with a maximum clock frequency of up to 170 MHz, the design can perform 42.5M GMD or QRD computations per second. Finally, chapter 8 presents the conclusions and future work of this dissertation.
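As background for the 2×2 building block mentioned above, here is a small sketch of what a geometric mean decomposition produces in the simplest case. It uses the textbook closed-form step for an already diagonal 2×2 matrix, which is the kind of step conventional SVD-based GMD algorithms rely on; it is not the bidiagonalization-based scheme developed in the dissertation. The function name `gmd_2x2_diagonal` and its interface are assumptions for this example.

```python
import math


def gmd_2x2_diagonal(d1, d2):
    """GMD of diag(d1, d2) with d1 >= d2 > 0.

    Returns (g1, r, g2) such that diag(d1, d2) = g1 * r * g2^T, where g1 and
    g2 are real rotations and r is upper triangular with both diagonal
    entries equal to the geometric mean sqrt(d1 * d2).
    Illustrative sketch only; the name and interface are hypothetical.
    """
    if not d1 >= d2 > 0:
        raise ValueError("expected d1 >= d2 > 0")

    mean = math.sqrt(d1 * d2)             # target diagonal value
    # c^2 = (mean^2 - d2^2) / (d1^2 - d2^2) simplifies to d2 / (d1 + d2),
    # which also stays well defined when d1 == d2.
    c = math.sqrt(d2 / (d1 + d2))
    s = math.sqrt(d1 / (d1 + d2))

    g2 = [[c, -s], [s, c]]                # right rotation
    g1 = [[c * d1 / mean, -s * d2 / mean],
          [s * d2 / mean, c * d1 / mean]]  # left rotation
    r = [[mean, s * c * (d2 ** 2 - d1 ** 2) / mean],
         [0.0, mean]]
    return g1, r, g2
```

For example, `gmd_2x2_diagonal(4.0, 1.0)` yields an `r` whose diagonal entries both equal 2.0, the geometric mean of 4 and 1. Equal diagonal entries are what make GMD attractive for precoding, since every spatial sub-channel then sees the same gain.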
|
author2 |
Yin-Tsung Hwang |
author_facet |
Yin-Tsung Hwang Wei-Da Chen 陳韋達 |
author |
Wei-Da Chen 陳韋達 |
spellingShingle |
Wei-Da Chen 陳韋達 Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems |
author_sort |
Wei-Da Chen |
title |
Designs and Chip Implementations of Fast Matrix Decomposition Schemes for Precoding and Signal Detection in MIMO OFDM Systems |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/08261397886682974638 |