HETEROGENIOUS BLOCKED ALL-PAIRS SHORTEST PATHS ALGORITHM
The problem of finding the shortest paths between all pairs of vertices in a weighted directed graph is considered. The algorithms of Dijkstra and Floyd-Warshall, homogeneous block and parallel algorithms and other algorithms of solving this problem are known. A new heterogeneous block algorithm is...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: |
Belarusian National Technical University
2017-11-01
|
Series: | Sistemnyj Analiz i Prikladnaâ Informatika |
Subjects: | |
Online Access: | https://sapi.bntu.by/jour/article/view/178 |
Summary: | The problem of finding the shortest paths between all pairs of vertices in a weighted directed graph is considered. The algorithms of Dijkstra and Floyd-Warshall, homogeneous block and parallel algorithms and other algorithms of solving this problem are known. A new heterogeneous block algorithm is proposed which considers various types of blocks and takes into account the shared hierarchical memory organization and multi-core processors for calculating each type of block. The proposed heterogeneous block computing algorithms are compared with the generally accepted homogeneous universal block calculation algorithm at theoretical and experimental levels. The main emphasis is on using the nature of the heterogeneity, the interaction of blocks during computation and the variation in block size, the size of the block matrix and the total number of blocks in order to identify the possibility of reducing the amount of computation performed during the calculation of the block, reducing the activity of the processor’s cache memory and determining the influence of the calculation time of each block type on the total execution time of the heterogeneous block algorithm. A recurrent resynchronized algorithm for calculating the diagonal block (D0) is proposed, which improves the use of the processor’s cache and reduces the number of iterations up to 3 times that are necessary to calculate the diagonal block, which implies the acceleration in calculating the diagonal block up to 60%. For more efficient work with the cache memory, variants of permutation of the basic loops k-i-j in the algorithms of calculating the blocks of the cross (C1 and C2) and the updated blocks (U3) are proposed. These permutations in combination with the proposed algorithm for calculating the diagonal block reduce the total runtime of the heterogeneous block algorithm to 13% on average against the homogeneous block algorithm. |
---|---|
ISSN: | 2309-4923 2414-0481 |