Parallel PDE Solvers on cc-NUMA Systems

The current trend in parallel computers is that systems with a large shared memory are becoming more and more popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of par...


Bibliographic Details
Main Author: Nordén, Markus
Format: Others
Language: English
Published: Uppsala universitet, Avdelningen för teknisk databehandling 2004
Subjects: Software Engineering; Programvaruteknik
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307
id ndltd-UPSALLA1-oai-DiVA.org-uu-86307
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-uu-86307
2018-01-14T05:09:35Z
Parallel PDE Solvers on cc-NUMA Systems
eng
Nordén, Markus
Uppsala universitet, Avdelningen för teknisk databehandling
Uppsala universitet, Numerisk analys
2004
Software Engineering
Programvaruteknik
Licentiate thesis, comprehensive summary
info:eu-repo/semantics/masterThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307
IT licentiate theses / Uppsala University, Department of Information Technology, 1404-5117 ; 2004-002
application/postscript
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Software Engineering
Programvaruteknik
description The current trend in parallel computers is that systems with a large shared memory are becoming increasingly popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of parallel PDE solvers on cc-NUMA computers is studied. In particular, we consider the shared namespace programming model, represented by OpenMP. Since the main memory is physically, or geographically, distributed over several multi-processor nodes, the latency for local memory accesses is smaller than for remote accesses. Therefore, the geographical locality of the data becomes important. The questions posed in this thesis are: (1) How large is the influence of the non-uniformity of the memory system on performance? (2) How should a program be written in order to reduce this influence? (3) Is it possible to introduce optimizations in the computer system for this purpose? Most of the application codes studied solve the Euler equations, using either a finite difference method or a finite volume method, and are parallelized with OpenMP. Comparisons are made with an alternative implementation using MPI and with OpenMP-based PDE solvers that solve other equations using different numerical methods. The main conclusion is that geographical locality is important for performance on cc-NUMA systems. Such locality can be achieved through self-optimization provided in the system or through migrate-on-next-touch directives that could be inserted automatically by the compiler. We also conclude that OpenMP is competitive with MPI on cc-NUMA systems if care is taken to get a favourable data distribution.
author Nordén, Markus
title Parallel PDE Solvers on cc-NUMA Systems
publisher Uppsala universitet, Avdelningen för teknisk databehandling
publishDate 2004
url http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307
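
The description's second question, how a program should be written to reduce the influence of non-uniform memory, can be illustrated with a minimal sketch of first-touch data placement in C with OpenMP. This is not code from the thesis. It assumes the operating system places each memory page on the node of the thread that first writes to it, which is a common default but not guaranteed. The function names `alloc_first_touch` and `stencil_sweep` are hypothetical, and the three-point averaging stencil is only a stand-in for a finite difference kernel. The migrate-on-next-touch directives named in the description are not shown.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate n doubles and initialise them in parallel.
   Assumption: the OS uses a first-touch page placement policy, so each
   page lands in the memory of the node whose thread writes it first. */
double *alloc_first_touch(size_t n, double value)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;

    /* The static schedule gives each thread one contiguous index range. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; ++i)
        a[i] = value;

    return a;
}

/* One sweep of a 1D three-point averaging stencil:
   out[i] = (in[i-1] + in[i+1]) / 2 for interior points,
   boundary values are copied unchanged. */
void stencil_sweep(const double *in, double *out, size_t n)
{
    assert(n >= 2);

    out[0] = in[0];
    out[n - 1] = in[n - 1];

    /* Same static schedule as the initialisation, so each thread works
       on roughly the index range it touched first. */
    #pragma omp parallel for schedule(static)
    for (long i = 1; i < (long)n - 1; ++i)
        out[i] = 0.5 * (in[i - 1] + in[i + 1]);
}
```

Both arrays passed to `stencil_sweep` would be created with `alloc_first_touch`, so that initialisation and computation share one thread-to-data mapping. Without OpenMP enabled at compile time the pragmas are ignored and the code runs serially with the same results.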