Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms

Master's === National Tsing Hua University === Department of Computer Science === 102 === Because of its structural simplicity, high density per unit area, and low cost, DRAM is well suited to serve as main memory in computer architectures. However, from a historical point of view, since DRAM came into widespread use, the rate of improvement...

Bibliographic Details
Main Authors: Chen, Huan-Wen, 陳煥文
Other Authors: Huang, Chih-Tsun
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/62576801445937981882
id ndltd-TW-102NTHU5392004
record_format oai_dc
spelling ndltd-TW-102NTHU5392004 2015-10-13T22:29:58Z http://ndltd.ncl.edu.tw/handle/62576801445937981882 Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms 應用於多核心平台之可堆疊記憶體存取效率改進與分析 Chen, Huan-Wen 陳煥文 Master's, National Tsing Hua University, Department of Computer Science, 102 Huang, Chih-Tsun 黃稚存 2013 thesis 77 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Tsing Hua University === Department of Computer Science === 102 === Because of its structural simplicity, high density per unit area, and low cost, DRAM is well suited to serve as main memory in computer architectures. Historically, however, since DRAM came into widespread use, processor speed has improved faster than DRAM speed, a phenomenon that W. Wulf and S. McKee called the “memory wall”. Over the past few decades the number of on-chip cores has grown from one to several, and upcoming Network-on-Chip (NoC) based many-core architectures, most of them meshes, no longer simply push the performance of a single processor but exploit parallelism to meet throughput requirements more cost-effectively. Even so, the demand for memory bandwidth and throughput keeps increasing. Many engineers have therefore worked to improve the efficiency between the memory controller and the DRAM devices by proposing better memory scheduling policies, increasing bandwidth, and improving access speed. Recently, the emergence of 3D-stacked DRAM (Wide I/O) has slightly narrowed the speed gap between the processor and the memory system. However, in an architecture that uses the NoC as the bridge between processors and memory controllers, some DRAM requests must travel a long distance across the network to reach a memory controller. Motivated by this, this thesis presents an architecture that improves the efficiency of accessing stacked memories on many-core platforms. The architecture uses an extra switch network to transport packets from the processors to the DRAM subsystem and assigns each small group of processors to a specific DRAM channel. This alleviates traffic contention between DRAM requests and inter-processor communication. As a baseline, we use the traditional method, in which all DRAM requests are routed through the NoC.
Experimental results on SPLASH-2 applications show significant speedups ranging from 1.13 times to 2.57 times, using a cost-affordable crossbar switch network that is also applicable to the Wide I/O DRAM interface.
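The two ideas in the description above (long NoC paths to a memory controller, and grouping processors onto a DRAM channel) can be illustrated with a small sketch. This is not the thesis's implementation. The 4x4 mesh, the corner controller placement, the XY routing, the group size of four, and the function names `xy_hops`, `channel_for_core`, and `worst_case_hops` are all assumptions chosen for illustration.

```python
# Illustrative sketch only: mesh size, controller placement, routing policy,
# and group size are assumptions, not values taken from the thesis.

MESH_W, MESH_H = 4, 4                            # assumed 4x4 mesh (16 cores)
CONTROLLERS = [(0, 0), (3, 0), (0, 3), (3, 3)]   # assumed corner memory controllers


def xy_hops(src, dst):
    """Hop count between two mesh nodes under dimension-ordered (XY) routing."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def worst_case_hops():
    """Largest hop count from any core to any controller when requests use the NoC."""
    cores = [(x, y) for y in range(MESH_H) for x in range(MESH_W)]
    return max(xy_hops(core, mc) for core in cores for mc in CONTROLLERS)


def channel_for_core(core_id, cores_per_group):
    """Statically map a core to a DRAM channel by grouping consecutive core IDs."""
    return core_id // cores_per_group


if __name__ == "__main__":
    print("worst-case NoC hops to a controller:", worst_case_hops())
    print("core -> channel:", [channel_for_core(i, 4) for i in range(16)])
```

Under these assumptions, a request from one corner of the mesh to the controller at the opposite corner crosses six links that it shares with inter-processor traffic, whereas a fixed core-to-channel grouping lets a dedicated switch carry DRAM requests without touching the mesh.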
author2 Huang, Chih-Tsun
author_facet Huang, Chih-Tsun
Chen, Huan-Wen
陳煥文
author Chen, Huan-Wen
陳煥文
spellingShingle Chen, Huan-Wen
陳煥文
Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
author_sort Chen, Huan-Wen
title Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
title_sort efficiency improvement and analysis of accessing stacked memories on many-core platforms
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/62576801445937981882