COLLECTIVE COMMUNICATION AND BARRIER SYNCHRONIZATION ON NVIDIA CUDA GPU
GPUs (Graphics Processing Units) employ a multi-threaded execution model using multiple SIMD cores. Compared to use of a single SIMD engine, this architecture can scale to more processing elements. However, GPUs sacrifice the timing properties which made barrier synchronization implicit and collecti...
| Main Author: | Rivera-Polanco, Diego Alejandro |
|---|---|
| Format: | Others |
| Published: | UKnowledge 2009 |
| Subjects: | |
| Online Access: | http://uknowledge.uky.edu/gradschool_theses/635 http://uknowledge.uky.edu/cgi/viewcontent.cgi?article=1639&context=gradschool_theses |
Similar Items
-
Performance Metrics Analysis of GamingAnywhere with GPU accelerated Nvidia CUDA
by: Sreenibha Reddy, Byreddy
Published: (2018) -
Performance Metrics Analysis of GamingAnywhere with GPU accelerated NVIDIA CUDA
by: Sreenibha Reddy, Byreddy
Published: (2018) -
Performance Metrics Analysis of GamingAnywhere with GPU accelerated NVIDIA CUDA using gVirtuS
by: Zaahid, Mohammed
Published: (2018) -
Implementing method of moments on a GPGPU using Nvidia CUDA
by: Virk, Bikram
Published: (2010) -
A Comparative Study of the Implementation of SJF and SRT Algorithms on the GPU Processor Using CUDA
by: Youness Rtal, et al.
Published: (2021-02-01)