Design and Analysis of Inter-PE Communication on Many-Core Platform
Master's === National Tsing Hua University === Department of Computer Science === 101 === As the number of Processing Elements (PEs) in modern many-core platforms continues to increase, the throughput and reliability of inter-PE communication at the application level have become important issues. In our previous work, we proposed a Network-on-Chip...
Main Authors: | Chen, Yu-Hsun 陳鈺勳 |
---|---|
Other Authors: | Huang, Chih-Tsun 黃稚存 |
Format: | Others |
Language: | en_US |
Published: | 2012 |
Online Access: | http://ndltd.ncl.edu.tw/handle/71791278606664137053 |
id |
ndltd-TW-101NTHU5392015 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101NTHU5392015 2015-10-13T21:55:44Z http://ndltd.ncl.edu.tw/handle/71791278606664137053 Design and Analysis of Inter-PE Communication on Many-Core Platform 多核心平台上運算元件間相互通訊之設計與分析 Chen, Yu-Hsun 陳鈺勳 Master's National Tsing Hua University Department of Computer Science 101 As the number of Processing Elements (PEs) in modern many-core platforms continues to increase, the throughput and reliability of inter-PE communication at the application level have become important issues. In our previous work, we proposed a Network-on-Chip (NoC) based many-core platform consisting of 16 PEs, an on-chip communication library, and a flow control protocol that guarantees the reliability of inter-PE communication at the application level. The PEs communicate with one another through the PE-to-PE core, which is the interface connected to the NoC. The bottleneck of communication efficiency is the latency of a single data transmission, because the data must be read from local memory by the CPU and pushed into the PE-to-PE core. We propose an improved architecture that simplifies the interface between software and hardware and accesses local memory directly with burst-mode data transmission. In addition, we move the flow control protocol from software into hardware. We analyze the behavior of this improved architecture on the corresponding SystemC platform and attribute the latency of data transmission to the speed of local memory access. The experimental results show that the maximum throughput of inter-PE communication with the flow control protocol is 2687.3 Mbps, which is 22.6 times the 119.1 Mbps of our previous work. Using TSMC 0.13 μm CMOS technology, the area of the improved architecture operating at 100 MHz is 19.1K gates, which is 69.2% of that of the previous work. Comparing area and speed, the improved architecture offers faster data transmission and better area efficiency for inter-PE communication. Huang, Chih-Tsun 黃稚存 2012 學位論文 ; thesis 84 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Master's === National Tsing Hua University === Department of Computer Science === 101 === As the number of Processing Elements (PEs) in modern
many-core platforms continues to increase, the throughput and reliability of inter-PE
communication at the application level have become important issues. In our previous work,
we proposed a Network-on-Chip (NoC) based many-core platform consisting of 16 PEs, an
on-chip communication library, and a flow control protocol that guarantees the reliability
of inter-PE communication at the application level. The PEs communicate with one another
through the PE-to-PE core, which is the interface connected to the NoC. The bottleneck of
communication efficiency is the latency of a single data transmission, because the data
must be read from local memory by the CPU and pushed into the PE-to-PE core. We propose an
improved architecture that simplifies the interface between software and hardware and
accesses local memory directly with burst-mode data transmission. In addition, we move the
flow control protocol from software into hardware.
We analyze the behavior of this improved architecture on the corresponding SystemC platform
and attribute the latency of data transmission to the speed of local memory access.
The experimental results show that the maximum throughput of inter-PE communication with
the flow control protocol is 2687.3 Mbps, which is 22.6 times the 119.1 Mbps of our
previous work. Using TSMC 0.13 μm CMOS technology, the area of the improved architecture
operating at 100 MHz is 19.1K gates, which is 69.2% of that of the previous work. Comparing
area and speed, the improved architecture offers faster data transmission and better
area efficiency for inter-PE communication.
|
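The figures quoted in the description can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the thesis: the function names are hypothetical, the previous design's gate count is merely inferred from the stated 69.2% ratio, and the bits-per-cycle figure assumes the reported throughput was obtained at the same 100 MHz clock quoted for the area result, which the record does not confirm.

```python
# Figures quoted in the abstract.
NEW_THROUGHPUT_MBPS = 2687.3   # improved architecture, with flow control
OLD_THROUGHPUT_MBPS = 119.1    # previous work
NEW_AREA_KGATES = 19.1         # improved architecture, TSMC 0.13 um
AREA_RATIO = 0.692             # new area as a fraction of the previous area
CLOCK_MHZ = 100.0              # ASSUMPTION: throughput measured at this clock


def speedup(new_mbps: float, old_mbps: float) -> float:
    """Ratio of the new throughput to the old throughput."""
    return new_mbps / old_mbps


def implied_previous_area(new_kgates: float, ratio: float) -> float:
    """Previous gate count (in K gates) inferred from the stated area ratio."""
    return new_kgates / ratio


def bits_per_cycle(mbps: float, clock_mhz: float) -> float:
    """Average payload bits moved per clock cycle (Mbit/s divided by MHz)."""
    return mbps / clock_mhz


if __name__ == "__main__":
    print(f"speedup: {speedup(NEW_THROUGHPUT_MBPS, OLD_THROUGHPUT_MBPS):.1f}x")
    print(f"implied previous area: "
          f"{implied_previous_area(NEW_AREA_KGATES, AREA_RATIO):.1f}K gates")
    print(f"bits per cycle (assumed 100 MHz): "
          f"{bits_per_cycle(NEW_THROUGHPUT_MBPS, CLOCK_MHZ):.2f}")
```

The first value reproduces the 22.6 times figure stated in the abstract; the other two are derived estimates under the assumptions above, not results reported by the author.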
author2 |
Huang, Chih-Tsun |
author_facet |
Huang, Chih-Tsun Chen, Yu-Hsun 陳鈺勳 |
author |
Chen, Yu-Hsun 陳鈺勳 |
spellingShingle |
Chen, Yu-Hsun 陳鈺勳 Design and Analysis of Inter-PE Communication on Many-Core Platform |
author_sort |
Chen, Yu-Hsun |
title |
Design and Analysis of Inter-PE Communication on Many-Core Platform |
title_short |
Design and Analysis of Inter-PE Communication on Many-Core Platform |
title_full |
Design and Analysis of Inter-PE Communication on Many-Core Platform |
title_fullStr |
Design and Analysis of Inter-PE Communication on Many-Core Platform |
title_full_unstemmed |
Design and Analysis of Inter-PE Communication on Many-Core Platform |
title_sort |
design and analysis of inter-pe communication on many-core platform |
publishDate |
2012 |
url |
http://ndltd.ncl.edu.tw/handle/71791278606664137053 |
work_keys_str_mv |
AT chenyuhsun designandanalysisofinterpecommunicationonmanycoreplatform AT chényùxūn designandanalysisofinterpecommunicationonmanycoreplatform AT chenyuhsun duōhéxīnpíngtáishàngyùnsuànyuánjiànjiānxiānghùtōngxùnzhīshèjìyǔfēnxī AT chényùxūn duōhéxīnpíngtáishàngyùnsuànyuánjiànjiānxiānghùtōngxùnzhīshèjìyǔfēnxī |
_version_ |
1718070435960061952 |