Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms

Master's === National Tsing Hua University === Department of Computer Science === 102 === Because DRAM is structurally simple, dense per unit area, and inexpensive, it is well suited to serve as main memory in a computer architecture. Historically, however, since DRAM came into wide use, the rate of improvement...

Bibliographic Details
Main Authors:	Chen, Huan-Wen, 陳煥文
Other Authors:	Huang, Chih-Tsun
Format:	Others
Language:	en_US
Published:	2013
Online Access:	http://ndltd.ncl.edu.tw/handle/62576801445937981882

id	ndltd-TW-102NTHU5392004
record_format	oai_dc
spelling	ndltd-TW-102NTHU5392004 2015-10-13T22:29:58Z http://ndltd.ncl.edu.tw/handle/62576801445937981882 Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms 應用於多核心平台之可堆疊記憶體存取效率改進與分析 Chen, Huan-Wen 陳煥文 Master's, National Tsing Hua University, Department of Computer Science, 102 Huang, Chih-Tsun 黃稚存 2013 thesis 77 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Tsing Hua University === Department of Computer Science === 102 === Because DRAM is structurally simple, dense per unit area, and inexpensive, it is well suited to serve as main memory in a computer architecture. Historically, however, since DRAM came into wide use, processor speed has improved faster than DRAM speed, a phenomenon that W. Wulf and S. McKee called the “memory wall”. Over the past few decades the number of on-chip cores has grown from one to several, and the upcoming NoC-based (mostly mesh) many-core architectures no longer simply push single-processor performance; instead they exploit parallelism to meet throughput requirements more cost-effectively. Unfortunately, the demand for memory bandwidth and throughput keeps increasing. Many engineers have therefore tried to improve the efficiency between the memory controller and the DRAM devices by proposing better memory scheduling policies, increasing bandwidth, and improving access speed. Recently, the emergence of 3D-stacked DRAM (Wide I/O) has slightly narrowed the speed gap between the processor and the memory system. However, an architecture that uses a Network-on-Chip as the bridge between processors and memory controllers has the drawback that some DRAM requests from processors must travel a long distance to reach a memory controller. Motivated by this, in this thesis we present an architecture that improves the efficiency of accessing stacked memories on many-core platforms. The architecture uses an extra switch network to carry packets from the processors to the DRAM subsystem, and it assigns each small group of processors to a specific DRAM channel. This alleviates traffic contention between DRAM requests and inter-processor communication. As the baseline for comparison we use the traditional method, in which all DRAM requests are routed through the NoC. Experimental results on SPLASH2 applications show speedups ranging from 1.13 times to 2.57 times, using a cost-affordable crossbar switch network that is also applicable to the Wide I/O DRAM interface.
author2	Huang, Chih-Tsun
author_facet	Huang, Chih-Tsun Chen, Huan-Wen 陳煥文
author	Chen, Huan-Wen 陳煥文
spellingShingle	Chen, Huan-Wen 陳煥文 Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
author_sort	Chen, Huan-Wen
title	Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
title_short	Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
title_full	Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
title_fullStr	Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
title_full_unstemmed	Efficiency Improvement and Analysis of Accessing Stacked Memories on Many-Core Platforms
title_sort	efficiency improvement and analysis of accessing stacked memories on many-core platforms
publishDate	2013
url	http://ndltd.ncl.edu.tw/handle/62576801445937981882
work_keys_str_mv	AT chenhuanwen efficiencyimprovementandanalysisofaccessingstackedmemoriesonmanycoreplatforms AT chénhuànwén efficiencyimprovementandanalysisofaccessingstackedmemoriesonmanycoreplatforms AT chenhuanwen yīngyòngyúduōhéxīnpíngtáizhīkěduīdiéjìyìtǐcúnqǔxiàolǜgǎijìnyǔfēnxī AT chénhuànwén yīngyòngyúduōhéxīnpíngtáizhīkěduīdiéjìyìtǐcúnqǔxiàolǜgǎijìnyǔfēnxī
_version_	1718077633646821376

Similar Items