Parallel PDE Solvers on cc-NUMA Systems

The current trend in parallel computers is that systems with a large shared memory are becoming more and more popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA). In the present thesis, the performance of par...

Bibliographic Details
Main Author:	Nordén, Markus
Format:	Others
Language:	English
Published:	Uppsala universitet, Avdelningen för teknisk databehandling 2004
Subjects:	Software Engineering; Programvaruteknik
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307

id	ndltd-UPSALLA1-oai-DiVA.org-uu-86307
record_format	oai_dc
spelling	ndltd-UPSALLA1-oai-DiVA.org-uu-86307 | 2018-01-14T05:09:35Z | Parallel PDE Solvers on cc-NUMA Systems | eng | Nordén, Markus | Uppsala universitet, Avdelningen för teknisk databehandling | Uppsala universitet, Numerisk analys | 2004 | Software Engineering | Programvaruteknik | Licentiate thesis, comprehensive summary | info:eu-repo/semantics/masterThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307 | IT licentiate theses / Uppsala University, Department of Information Technology, 1404-5117 ; 2004-002 | application/postscript | info:eu-repo/semantics/openAccess
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Software Engineering Programvaruteknik
spellingShingle	Software Engineering Programvaruteknik Nordén, Markus Parallel PDE Solvers on cc-NUMA Systems
description	The current trend in parallel computing is toward systems with a large shared memory. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA). This thesis studies the performance of parallel PDE solvers on cc-NUMA computers. In particular, we consider the shared namespace programming model, represented by OpenMP. Since the main memory is physically, or geographically, distributed over several multi-processor nodes, the latency of local memory accesses is smaller than that of remote accesses. Therefore, the geographical locality of the data becomes important. The questions posed in this thesis are: (1) How large is the influence of the non-uniformity of the memory system on performance? (2) How should a program be written in order to reduce this influence? (3) Is it possible to introduce optimizations in the computer system for this purpose? Most of the application codes studied solve the Euler equations, using either a finite difference method or a finite volume method, and are parallelized with OpenMP. Comparisons are made with an alternative implementation using MPI and with OpenMP-based PDE solvers that solve other equations using different numerical methods. The main conclusion is that geographical locality is important for performance on cc-NUMA systems. Such locality can be achieved through self-optimization provided by the system or through migrate-on-next-touch directives that could be inserted automatically by the compiler. We also conclude that OpenMP is competitive with MPI on cc-NUMA systems if care is taken to obtain a favourable data distribution.
author	Nordén, Markus
author_facet	Nordén, Markus
author_sort	Nordén, Markus
title	Parallel PDE Solvers on cc-NUMA Systems
title_short	Parallel PDE Solvers on cc-NUMA Systems
title_full	Parallel PDE Solvers on cc-NUMA Systems
title_fullStr	Parallel PDE Solvers on cc-NUMA Systems
title_full_unstemmed	Parallel PDE Solvers on cc-NUMA Systems
title_sort	parallel pde solvers on cc-numa systems
publisher	Uppsala universitet, Avdelningen för teknisk databehandling
publishDate	2004
url	http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86307
work_keys_str_mv	AT nordenmarkus parallelpdesolversonccnumasystems
_version_	1718609092281368576

Similar Items