ADVANCED SCHEDULER FOR COOPERATIVE EXECUTION OF THREADS ON MULTI-CORE SYSTEM

Three architectures of the cooperative thread scheduler in a multithreaded application that is executed on a multi-core system are considered. Architecture A0 is based on the synchronization and scheduling facilities, which are provided by the operating system. Architecture A1 introduces a new synch...

Bibliographic Details
Main Authors:	O. N. Karasik, A. A. Prihozhy
Format:	Article
Language:	English
Published:	Belarusian National Technical University 2017-05-01
Series:	Sistemnyj Analiz i Prikladnaâ Informatika
Online Access:	https://sapi.bntu.by/jour/article/view/144

id	doaj-b8f5c8f2d5314c13825070e91fd6cd1d
record_format	Article
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	O. N. Karasik A. A. Prihozhy
title	ADVANCED SCHEDULER FOR COOPERATIVE EXECUTION OF THREADS ON MULTI-CORE SYSTEM
publisher	Belarusian National Technical University
series	Sistemnyj Analiz i Prikladnaâ Informatika
issn	2309-4923 2414-0481
publishDate	2017-05-01
description	Three architectures of a cooperative thread scheduler for a multithreaded application executed on a multi-core system are considered. Architecture A0 is based on the synchronization and scheduling facilities provided by the operating system. Architecture A1 introduces a new synchronization primitive and a single queue of blocked threads in the scheduler, which reduces the interaction between the threads and the operating system and significantly speeds up the blocking and unblocking of threads. Architecture A2 replaces the single queue of blocked threads with dedicated queues, one for each synchronization primitive, extends the number of internal states of the primitive, reduces the interdependence of the scheduling threads, and further speeds up the blocking and unblocking of threads. All scheduler architectures are implemented on Windows operating systems and are based on User Mode Scheduling. Experimental results are obtained for multithreaded applications that implement two blocked parallel algorithms for solving systems of linear algebraic equations by Gaussian elimination. The algorithms differ in how data is distributed among threads and in their thread synchronization models. The number of threads varied from 32 to 7936. Compared to architecture A0, architecture A1 shows an acceleration of up to 8.65% and architecture A2 an acceleration of up to 11.98% for the blocked parallel algorithms that compute the triangular form and perform the back substitution. On the back substitution stage of the algorithms, architecture A1 gives an acceleration of up to 125% and architecture A2 an acceleration of up to 413% compared to architecture A0.
The experiments show that the proposed architectures A1 and A2 outperform A0, with the gain depending on the number of thread blocking and unblocking operations that occur during the execution of a multithreaded application. The computational experiments demonstrate that the proposed advanced versions of the thread scheduler improve the parameters of multithreaded applications on a heterogeneous multi-core system.
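The difference between the single blocked-thread queue of A1 and the per-primitive queues of A2 can be illustrated with a toy model. The sketch below is not the authors' implementation and does not use Windows User Mode Scheduling; the class names, the `inspected` counter, and the assumption that a single shared queue must be scanned to find the waiters of one primitive are all hypothetical, chosen only to show why dedicated queues may make unblocking cheaper.

```python
from collections import defaultdict, deque


class SingleQueueScheduler:
    """A1-style toy model (assumption): one shared queue of (thread, primitive)."""

    def __init__(self):
        self.blocked = deque()
        self.inspected = 0  # queue entries examined while unblocking

    def block(self, thread, primitive):
        self.blocked.append((thread, primitive))

    def signal(self, primitive):
        """Release every thread waiting on `primitive`, in FIFO order."""
        released, remaining = [], deque()
        for thread, waited_on in self.blocked:
            self.inspected += 1  # every entry in the shared queue is checked
            if waited_on == primitive:
                released.append(thread)
            else:
                remaining.append((thread, waited_on))
        self.blocked = remaining
        return released


class PerPrimitiveScheduler:
    """A2-style toy model (assumption): a dedicated queue per primitive."""

    def __init__(self):
        self.blocked = defaultdict(deque)
        self.inspected = 0

    def block(self, thread, primitive):
        self.blocked[primitive].append(thread)

    def signal(self, primitive):
        """Release the waiters of `primitive` without touching other queues."""
        queue = self.blocked.pop(primitive, deque())
        self.inspected += len(queue)  # only this primitive's waiters are visited
        return list(queue)


def run(scheduler, n_primitives=4, threads_per_primitive=8):
    """Block threads round-robin across primitives, then signal each primitive."""
    tid = 0
    for _ in range(threads_per_primitive):
        for primitive in range(n_primitives):
            scheduler.block(tid, primitive)
            tid += 1
    released = [scheduler.signal(p) for p in range(n_primitives)]
    return released, scheduler.inspected
```

With the default 4 primitives and 8 threads each, both models release the same threads in the same order, but the single-queue model examines 80 queue entries while the per-primitive model examines 32. The real schedulers also differ in primitive states and in how scheduling threads interact, which this model does not attempt to capture.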
url	https://sapi.bntu.by/jour/article/view/144
work_keys_str_mv	AT onkarasik advancedschedulerforcooperativeexecutionofthreadsonmulticoresystem AT aaprihozhy advancedschedulerforcooperativeexecutionofthreadsonmulticoresystem
_version_	1721253163928911872
spelling	doaj-b8f5c8f2d5314c13825070e91fd6cd1d; 2021-07-29T08:38:32Z; eng; Belarusian National Technical University; Sistemnyj Analiz i Prikladnaâ Informatika; 2309-4923; 2414-0481; 2017-05-01; 0141110.21122/2309-4923-2017-1-4-11112; ADVANCED SCHEDULER FOR COOPERATIVE EXECUTION OF THREADS ON MULTI-CORE SYSTEM; O. N. Karasik (Belarusian National Technical University); A. A. Prihozhy (Belarusian National Technical University); https://sapi.bntu.by/jour/article/view/144

Similar Items