Bitslicing Arithmetic/Boolean Masking Conversions for Fun and Profit

The performance of higher-order masked implementations of lattice-based based key encapsulation mechanisms (KEM) is currently limited by the costly conversions between arithmetic and Boolean masking. While bitslicing has been shown to strongly speed up masked implementations of symmetric primitives...

全面介紹

書目詳細資料
發表在:Transactions on Cryptographic Hardware and Embedded Systems
Main Authors: Olivier Bronchain, Gaëtan Cassiers
格式: Article
語言:英语
出版: Ruhr-Universität Bochum 2022-08-01
主題:
在線閱讀:https://ojs-dev.ub.rub.de/index.php/TCHES/article/view/9831
實物特徵
總結:The performance of higher-order masked implementations of lattice-based based key encapsulation mechanisms (KEM) is currently limited by the costly conversions between arithmetic and Boolean masking. While bitslicing has been shown to strongly speed up masked implementations of symmetric primitives, its use in arithmetic-to-Boolean and Boolean-to-arithmetic masking conversion gadgets has never been thoroughly investigated. In this paper, we first show that bitslicing can indeed accelerate existing conversion gadgets. We then optimize these gadgets, exploiting the degrees of freedom offered by bitsliced implementations. As a result, we introduce new arbitrary-order Boolean masked addition, arithmetic-to-Boolean and Boolean-to-arithmetic masking conversion gadgets, each in two variants: modulo 2k and modulo p (for any integers k and p). Practically, our new gadgets achieve a speedup of up to 25x over the state of the art. Turning to the KEM application, we develop the first open-source embedded (Cortex-M4) implementations of Kyber768 and Saber masked at arbitrary order. The implementations based on the new bitsliced gadgets achieve a speedup of 1.8x for Kyber and 3x for Saber, compared to the implementation based on state-of-the-art gadgets. The bottleneck of the bitslice implementations is the masked Keccak-f[1600] permutation.
ISSN:2569-2925