**作者:Jianan Mu, Yi Ren, Wen Wang, Yizhong Hu, Shuai Chen, Chip-Hong Chang, Junfeng Fan, Jing Ye, Yuan Cao, Huawei Li, Xiaowei Li


一、一句话描述该论文解决了什么事情?

  1. 具有较大存储空间和复杂访存需求的数论域变换(NTT)算法,是局限后量子(PQC)和全同态(FHE)算法的主要瓶颈;
  2. 而已有工作提出的专用的并行和流水线NTT架构,可以提高特定参数下NTT的计算效率。然而,这些架构优化策略很难适应安全级别和硬件并行性要求的变化;
  3. 因此,本文提出了一种新的设计框架,可以为任何长度和模数的NTT多项式和具有不同层数的处理元件(PE)或PE阵列生成efficient和high-performance的NTT加速器。

二、背景

N q n d

2.1 基于NTT的多项式乘法

  1. Polynomial multiplication is a computational bottleneck of lattice-based cryptography such as Dilithium and Falcon schemes from the PQC family. In these algorithms, polynomials are defined over the ring $R_{q}[x] = \mathbb{Z}{q}[x]/(x^{N}+1),$ where $\mathbb{Z}{q}$ is a quotient ring with prime $q$, $N$ is an integer and $q$ satisfies $q \equiv 1 \bmod 2N$.
  2. The computational complexity of the schoolbook polynomial multiplication is $O(N^2)$. Fast Fourier Transform (FFT) can reduce this complexity to $O(N\log_2{N})$ but roundoff errors can occur in complex number operations.
  3. NTT can be seen as a Discrete Fourier Transform (DFT) in the finite field. It enables polynomial arithmetic to be carried out in the integer domain with no roundoff error.

Untitled

Untitled

Untitled