Parallel PDE Solvers on cc-NUMA Systems
The current trend in parallel computers is that systems with a large shared memory are becoming more and more popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of par...
Main Author: | Nordén, Markus |
---|---|
Format: | Others |
Language: | English |
Published: | Uppsala universitet, Avdelningen för teknisk databehandling, 2004 |
Subjects: | Software Engineering; Programvaruteknik |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307 |
id |
ndltd-UPSALLA1-oai-DiVA.org-uu-86307 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-uu-86307
2018-01-14T05:09:35Z
Parallel PDE Solvers on cc-NUMA Systems
eng
Nordén, Markus
Uppsala universitet, Avdelningen för teknisk databehandling
Uppsala universitet, Numerisk analys
2004
Software Engineering
Programvaruteknik
Licentiate thesis, comprehensive summary
info:eu-repo/semantics/masterThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307
IT licentiate theses / Uppsala University, Department of Information Technology, 1404-5117 ; 2004-002
application/postscript
info:eu-repo/semantics/openAccess |
collection |
NDLTD |
sources |
NDLTD |
topic |
Software Engineering Programvaruteknik |
description |
Systems with a large shared memory are becoming increasingly popular in parallel computing. A shared memory system can be either a uniform memory architecture (UMA) or a cache-coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of parallel PDE solvers on cc-NUMA computers is studied. In particular, we consider the shared namespace programming model, represented by OpenMP. Since the main memory is physically, or geographically, distributed over several multi-processor nodes, the latency for local memory accesses is smaller than for remote accesses. Therefore, the geographical locality of the data becomes important. The questions posed in this thesis are: (1) How large is the influence of the non-uniformity of the memory system on performance? (2) How should a program be written in order to reduce this influence? (3) Is it possible to introduce optimizations in the computer system for this purpose? Most of the application codes studied solve the Euler equations, using a finite difference method and a finite volume method respectively, and are parallelized with OpenMP. Comparisons are made with an alternative implementation using MPI and with PDE solvers implemented with OpenMP that solve other equations using different numerical methods. The main conclusion is that geographical locality is important for performance on cc-NUMA systems. Such locality can be achieved through self-optimization provided in the system or through migrate-on-next-touch directives that could be inserted automatically by the compiler. We also conclude that OpenMP is competitive with MPI on cc-NUMA systems if care is taken to get a favourable data distribution. |
author |
Nordén, Markus |
title |
Parallel PDE Solvers on cc-NUMA Systems |