Allocation of SISAL program graphs to processors using BLAS
There are a number of well known techniques for extracting parallelism from a given program. They range from hardware implementations, building restructuring compilers or reorganizing of programs so as to specify all the available parallelism. The success rate of any of the known techniques is rathe...
Main Author: | |
---|---|
Other Authors: | |
Language: | en_US |
Published: |
2012
|
Subjects: | |
Online Access: | http://hdl.handle.net/1957/35764 |
id |
ndltd-ORGSU-oai-ir.library.oregonstate.edu-1957-35764 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-ORGSU-oai-ir.library.oregonstate.edu-1957-357642012-12-14T03:14:22ZAllocation of SISAL program graphs to processors using BLASRaisinghani, Manoj H.Computer architectureProgramming languages (Electronic computers)Parallel processing (Electronic computers)There are a number of well known techniques for extracting parallelism from a given program. They range from hardware implementations, building restructuring compilers or reorganizing of programs so as to specify all the available parallelism. The success rate of any of the known techniques is rather poor over all types of programs. This has pushed the research community to explore new languages and design different architectures to exploit program parallelism. The principles of dataflow architectures have addressed the problem of exploiting parallelism in systems by executing dataflow graphs. These graphs or programs represent data dependencies among instructions and execution of the graph proceeds in a data-driven manner. That is, an instruction is executed as soon as all its operands are available, without waiting for any program counter to sequence its execution, as is the case in conventional von Neumann architectures. In this thesis, data flow graphs are generated during the intermediate compilation of a functional language called SISAL (Streams and Iterations in a Single Assignment Language). The Intermediate Form (IFl) is a graphical language consisting of multiple acyclic function graphs that represent a given program. Each graph consists of a sequence of nodes and edges. The nodes specify the operation and the edges indicate the dependencies between the nodes. The graphs are further connected to each other by means of implicit dependencies. The Automator package developed in this project, preprocesses these multiple IF1 graphs and translates them into a single connected graph. It converts all implicit dependencies into actual ones. Additionally, complex language constructs like For All, loops and if-then-else are treated in special ways together with their nested levels by the Automator. There is virtually no limit to the number of nested levels that can be translated by this package. The Automator's prime contribution is in translating real programs written in SISAL into a specified format required by an allocation algorithm called the Balanced Layered Allocation Scheme (BLAS). BLAS partitions a connected graph into independent tasks and assigns them to processors in a multicomputer system. The problem of program allocation lies in maximizing parallelism while minimizing interprocessor communication costs. Hence, allocation is based on the best choice of communication to execution ratio for each task. BLAS utilizes heuristic rules to find a balance between computation and communication costs in the target system. Here the target architecture is a simulated nCUBE 3E computer, having a hypercube topology. Simulations show that, BLAS is effective in reducing the overall execution time of a program by considering the communication costs on the execution times. The results will help in understanding the effects in packing nodes (grain-packing), routing issues in the network and in general, the allocation problem to any processor in a network. In addition, tasks have also been assigned to adjacent processors only, instead of any processor on the hypercube network. The adjacent allocation to processors helps to determine trade-offs required between achieved speed-ups and the time it takes to completely allocate large graphs on compilation.Graduation date: 1994Lee, Ben2012-12-13T20:24:53Z2012-12-13T20:24:53Z1994-04-071994-04-07Thesis/Dissertationhttp://hdl.handle.net/1957/35764en_US |
collection |
NDLTD |
language |
en_US |
sources |
NDLTD |
topic |
Computer architecture Programming languages (Electronic computers) Parallel processing (Electronic computers) |
spellingShingle |
Computer architecture Programming languages (Electronic computers) Parallel processing (Electronic computers) Raisinghani, Manoj H. Allocation of SISAL program graphs to processors using BLAS |
description |
There are a number of well known techniques for extracting parallelism from a
given program. They range from hardware implementations, building restructuring
compilers or reorganizing of programs so as to specify all the available parallelism. The
success rate of any of the known techniques is rather poor over all types of programs.
This has pushed the research community to explore new languages and design different
architectures to exploit program parallelism.
The principles of dataflow architectures have addressed the problem of exploiting
parallelism in systems by executing dataflow graphs. These graphs or programs represent
data dependencies among instructions and execution of the graph proceeds in a data-driven
manner. That is, an instruction is executed as soon as all its operands are
available, without waiting for any program counter to sequence its execution, as is the
case in conventional von Neumann architectures.
In this thesis, data flow graphs are generated during the intermediate compilation of
a functional language called SISAL (Streams and Iterations in a Single Assignment
Language). The Intermediate Form (IFl) is a graphical language consisting of multiple
acyclic function graphs that represent a given program. Each graph consists of a
sequence of nodes and edges. The nodes specify the operation and the edges indicate the
dependencies between the nodes. The graphs are further connected to each other by
means of implicit dependencies.
The Automator package developed in this project, preprocesses these multiple IF1
graphs and translates them into a single connected graph. It converts all implicit
dependencies into actual ones. Additionally, complex language constructs like For All,
loops and if-then-else are treated in special ways together with their nested levels by the
Automator. There is virtually no limit to the number of nested levels that can be
translated by this package.
The Automator's prime contribution is in translating real programs written in SISAL
into a specified format required by an allocation algorithm called the Balanced Layered
Allocation Scheme (BLAS).
BLAS partitions a connected graph into independent tasks and assigns them to
processors in a multicomputer system. The problem of program allocation lies in
maximizing parallelism while minimizing interprocessor communication costs. Hence,
allocation is based on the best choice of communication to execution ratio for each task.
BLAS utilizes heuristic rules to find a balance between computation and communication
costs in the target system. Here the target architecture is a simulated nCUBE 3E
computer, having a hypercube topology.
Simulations show that, BLAS is effective in reducing the overall execution time of
a program by considering the communication costs on the execution times. The results
will help in understanding the effects in packing nodes (grain-packing), routing issues in
the network and in general, the allocation problem to any processor in a network. In
addition, tasks have also been assigned to adjacent processors only, instead of any
processor on the hypercube network. The adjacent allocation to processors helps to
determine trade-offs required between achieved speed-ups and the time it takes to
completely allocate large graphs on compilation. === Graduation date: 1994 |
author2 |
Lee, Ben |
author_facet |
Lee, Ben Raisinghani, Manoj H. |
author |
Raisinghani, Manoj H. |
author_sort |
Raisinghani, Manoj H. |
title |
Allocation of SISAL program graphs to processors using BLAS |
title_short |
Allocation of SISAL program graphs to processors using BLAS |
title_full |
Allocation of SISAL program graphs to processors using BLAS |
title_fullStr |
Allocation of SISAL program graphs to processors using BLAS |
title_full_unstemmed |
Allocation of SISAL program graphs to processors using BLAS |
title_sort |
allocation of sisal program graphs to processors using blas |
publishDate |
2012 |
url |
http://hdl.handle.net/1957/35764 |
work_keys_str_mv |
AT raisinghanimanojh allocationofsisalprogramgraphstoprocessorsusingblas |
_version_ |
1716393819300167680 |