Methods for Creating and Exploiting Data Locality

The gap between processor speed and memory latency has led to the use of caches in the memory systems of modern computers. Programs must use the caches efficiently and exploit data locality for maximum performance. Multiprocessors, built from many processing units, are becoming commonplace not only...


Bibliographic Details
Main Author: Wallin, Dan
Format: Doctoral Thesis
Language: English
Published: Uppsala universitet, Avdelningen för datorteknik 2006
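Subjects: data locality; temporal locality; spatial locality; prefetching; cache; cache behavior; cache coherence; snooping protocols; partial differential equation; shared-memory multiprocessor; chip multiprocessor; simulation; Computer engineering; Datorteknik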
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6837
http://nbn-resolving.de/urn:isbn:91-554-6555-2
id ndltd-UPSALLA1-oai-DiVA.org-uu-6837
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-uu-6837 ; 2013-01-08T13:07:08Z ; Methods for Creating and Exploiting Data Locality ; eng ; Wallin, Dan ; Uppsala universitet, Avdelningen för datorteknik ; Uppsala universitet, Datorteknik ; Uppsala : Acta Universitatis Upsaliensis ; 2006 ; Doctoral thesis, comprehensive summary ; info:eu-repo/semantics/doctoralThesis ; text ; urn:isbn:91-554-6555-2 ; Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, 1651-6214 ; 176 ; application/pdf ; info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic data locality
temporal locality
spatial locality
prefetching
cache
cache behavior
cache coherence
snooping protocols
partial differential equation
shared-memory multiprocessor
chip multiprocessor
simulation
Computer engineering
Datorteknik
description The gap between processor speed and memory latency has led to the use of caches in the memory systems of modern computers. Programs must use the caches efficiently and exploit data locality for maximum performance. Multiprocessors, built from many processing units, are becoming commonplace not only in large servers but also in smaller systems such as personal computers. Multiprocessors require careful data locality optimizations, since accesses from other processors can lead to invalidations and false-sharing cache misses. This thesis explores hardware and software approaches for creating and exploiting temporal and spatial locality in multiprocessors. We propose the capacity prefetching technique, which efficiently reduces the number of cache misses while avoiding false sharing by distinguishing, at run time, cache lines involved in communication from non-communicating cache lines. Prefetching techniques often lead to increased coherence and data traffic. The new bundling technique avoids one of these drawbacks by reducing the coherence traffic in multiprocessor prefetchers. This is especially important in snoop-based systems, where coherence bandwidth is a scarce resource. Most of the studies have been performed on advanced scientific algorithms. This thesis demonstrates that a cc-NUMA multiprocessor with hardware data migration and replication optimizations efficiently exploits the temporal locality in such codes. We further present a method of parallelizing a multigrid Gauss-Seidel partial differential equation solver that creates temporal locality at the expense of increased communication. Our conclusion is that on modern chip multiprocessors it is more important to optimize algorithms for data locality than to avoid communication, since communication can take place through a shared cache.
author Wallin, Dan
title Methods for Creating and Exploiting Data Locality
title_sort methods for creating and exploiting data locality
publisher Uppsala universitet, Avdelningen för datorteknik
publishDate 2006
url http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-6837
http://nbn-resolving.de/urn:isbn:91-554-6555-2
work_keys_str_mv AT wallindan methodsforcreatingandexploitingdatalocality
_version_ 1716509447623278592