An effective implementation of Strassen’s algorithm using AVX intrinsics for a multicore architecture

This paper proposes an effective implementation of Strassen’s algorithm with AVX intrinsics to accelerate matrix-matrix multiplication in a multicore system. AVX2 and FMA3 intrinsic functions are utilized, along with OpenMP, to implement the multiplication kernel of Strassen’s algorithm. Loop tiling...
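
For readers unfamiliar with the algorithm named in the abstract, the following is a minimal sketch of Strassen's seven-product recursion. It is not the authors' code: it is written in plain Python for readability, it assumes square matrices whose size is a power of two, and the names `strassen`, `naive_matmul`, `split` and `cutoff` are hypothetical. The scalar base case only stands in for the multiplication kernel, which the abstract says is built from AVX2 and FMA3 intrinsics with OpenMP; none of that vectorization or threading is shown here.

```python
def naive_matmul(A, B):
    """Triple-loop product; placeholder for the vectorized kernel (assumption)."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a = A[i][k]
            for j in range(p):
                C[i][j] += a * B[k][j]
    return C


def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def split(M):
    """Return the four quadrants M11, M12, M21, M22 of an even-sized square matrix."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B, cutoff=2):
    """Multiply square matrices A and B (size a power of two) with Strassen's method."""
    n = len(A)
    if n <= cutoff:
        # Below the cutoff, fall back to the ordinary product.
        return naive_matmul(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products instead of the eight of the block-wise method.
    M1 = strassen(add(A11, A22), add(B11, B22), cutoff)
    M2 = strassen(add(A21, A22), B11, cutoff)
    M3 = strassen(A11, sub(B12, B22), cutoff)
    M4 = strassen(A22, sub(B21, B11), cutoff)
    M5 = strassen(add(A11, A12), B22, cutoff)
    M6 = strassen(sub(A21, A11), add(B11, B12), cutoff)
    M7 = strassen(sub(A12, A22), add(B21, B22), cutoff)
    # Recombine the products into the four quadrants of the result.
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(add(sub(M1, M2), M3), M6)
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

In this sketch the `cutoff` parameter decides when the recursion hands over to the ordinary product. The value 2 is arbitrary and is not taken from the paper.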

Bibliographic Details
Main Authors: Nwe Zin Oo, Panyayot Chaikan
Format: Article
Language: English
Published: Prince of Songkla University 2020-12-01
Series: Songklanakarin Journal of Science and Technology (SJST)
Subjects: AVX, FMA
Online Access: https://rdo.psu.ac.th/sjstweb/journal/42-6/26.pdf