Design and Analysis of Inter-PE Communication on Many-Core Platform

Master's === National Tsing Hua University === Department of Computer Science === 101 === With the continuous increase in the number of Processing Elements (PEs) in modern many-core platforms, the throughput and reliability of inter-PE communication at the application level have become important issues. In our previous work, we proposed a Network-on-Chip...

Bibliographic Details
Main Authors: Chen, Yu-Hsun, 陳鈺勳
Other Authors: Huang, Chih-Tsun
Format: Others
Language: en_US
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/71791278606664137053
id ndltd-TW-101NTHU5392015
record_format oai_dc
spelling ndltd-TW-101NTHU5392015 2015-10-13T21:55:44Z http://ndltd.ncl.edu.tw/handle/71791278606664137053 Design and Analysis of Inter-PE Communication on Many-Core Platform 多核心平台上運算元件間相互通訊之設計與分析 Chen, Yu-Hsun 陳鈺勳 Master's National Tsing Hua University Department of Computer Science 101 With the continuous increase in the number of Processing Elements (PEs) in modern many-core platforms, the throughput and reliability of inter-PE communication at the application level have become important issues. In our previous work, we proposed a Network-on-Chip (NoC) based many-core platform consisting of 16 PEs, an on-chip communication library, and a flow control protocol that guarantees the reliability of inter-PE communication at the application level. The PEs communicate with one another through the PE-to-PE core, which is the interface connected to the NoC. The bottleneck in communication efficiency is the latency of a single data transmission, because the CPU must read the data from local memory and push it into the PE-to-PE core. We propose an improved architecture that simplifies the interface between software and hardware and accesses local memory directly using burst-mode data transmission. In addition, we move the flow control protocol from the software level into hardware. We analyze the behavior of this improved architecture on the corresponding SystemC platform and attribute the latency of data transmission to the speed of local memory access. The experimental results show that the maximum throughput of inter-PE communication with the flow control protocol is 2687.3 Mbps, which is 22.6 times the 119.1 Mbps achieved in our previous work. In TSMC 0.13 μm CMOS technology, the improved architecture operating at 100 MHz occupies 19.1K gates, which is 69.2% of the area of our previous work. Comparing area and speed, the improved architecture offers faster data transmission and better area efficiency for inter-PE communication. Huang, Chih-Tsun 黃稚存 2012 thesis 84 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Tsing Hua University === Department of Computer Science === 101 === With the continuous increase in the number of Processing Elements (PEs) in modern many-core platforms, the throughput and reliability of inter-PE communication at the application level have become important issues. In our previous work, we proposed a Network-on-Chip (NoC) based many-core platform consisting of 16 PEs, an on-chip communication library, and a flow control protocol that guarantees the reliability of inter-PE communication at the application level. The PEs communicate with one another through the PE-to-PE core, which is the interface connected to the NoC. The bottleneck in communication efficiency is the latency of a single data transmission, because the CPU must read the data from local memory and push it into the PE-to-PE core. We propose an improved architecture that simplifies the interface between software and hardware and accesses local memory directly using burst-mode data transmission. In addition, we move the flow control protocol from the software level into hardware. We analyze the behavior of this improved architecture on the corresponding SystemC platform and attribute the latency of data transmission to the speed of local memory access. The experimental results show that the maximum throughput of inter-PE communication with the flow control protocol is 2687.3 Mbps, which is 22.6 times the 119.1 Mbps achieved in our previous work. In TSMC 0.13 μm CMOS technology, the improved architecture operating at 100 MHz occupies 19.1K gates, which is 69.2% of the area of our previous work. Comparing area and speed, the improved architecture offers faster data transmission and better area efficiency for inter-PE communication.
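The figures quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the thesis; the function names are hypothetical, and the bits-per-cycle estimate assumes that the reported throughput was measured at the same 100 MHz clock quoted for synthesis, which the abstract does not state.

```python
# Figures quoted in the abstract.
NEW_THROUGHPUT_MBPS = 2687.3   # improved architecture, with flow control
OLD_THROUGHPUT_MBPS = 119.1    # previous work
NEW_AREA_KGATES = 19.1         # improved architecture, TSMC 0.13 um
AREA_FRACTION = 0.692          # new area as a fraction of the previous area
CLOCK_MHZ = 100.0              # ASSUMPTION: throughput measured at this clock


def speedup(new_mbps: float, old_mbps: float) -> float:
    """Ratio of new throughput to old throughput."""
    return new_mbps / old_mbps


def implied_previous_area(new_kgates: float, fraction: float) -> float:
    """Area of the previous design implied by the quoted percentage."""
    return new_kgates / fraction


def bits_per_cycle(mbps: float, clock_mhz: float) -> float:
    """Average payload bits moved per clock cycle (Mbps / MHz)."""
    return mbps / clock_mhz


if __name__ == "__main__":
    print(f"speedup: {speedup(NEW_THROUGHPUT_MBPS, OLD_THROUGHPUT_MBPS):.2f}x")
    print(f"implied previous area: "
          f"{implied_previous_area(NEW_AREA_KGATES, AREA_FRACTION):.1f}K gates")
    print(f"bits per cycle (assumed 100 MHz): "
          f"{bits_per_cycle(NEW_THROUGHPUT_MBPS, CLOCK_MHZ):.1f}")
```

The speedup rounds to the 22.6 stated in the abstract. The implied previous area and the bits-per-cycle value are derived quantities only and should be checked against the thesis itself.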
author2 Huang, Chih-Tsun
author_facet Huang, Chih-Tsun
Chen, Yu-Hsun
陳鈺勳
author Chen, Yu-Hsun
陳鈺勳
spellingShingle Chen, Yu-Hsun
陳鈺勳
Design and Analysis of Inter-PE Communication on Many-Core Platform
author_sort Chen, Yu-Hsun
title Design and Analysis of Inter-PE Communication on Many-Core Platform
publishDate 2012
url http://ndltd.ncl.edu.tw/handle/71791278606664137053