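FPGA multi-unit parallel optimized implementation of the post-quantum cryptosystem CRYSTALS-Kyber
Cite this article: LI Bin (李斌), CHEN Xiaojie (陈晓杰), FENG Feng (冯峰), ZHOU Qinglei (周清雷). FPGA multi-unit parallel optimized implementation of the post-quantum cryptosystem CRYSTALS-Kyber [J]. Journal on Communications (通信学报), 2022, 0(2): 196-207
Authors: LI Bin (李斌)  CHEN Xiaojie (陈晓杰)  FENG Feng (冯峰)  ZHOU Qinglei (周清雷)
Affiliations: School of Computer and Artificial Intelligence, Zhengzhou University; State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University
Funding: National Key Research and Development Program of China (No.2016YFB0800100, No.2016YFB0800101); National Natural Science Foundation of China (No.61572444).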
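Abstract (translated from the Chinese): In lattice-based post-quantum cryptography, polynomial multiplication is complex and time-consuming. To improve the efficiency of lattice-based cryptography in practical applications, a multi-unit parallel optimized FPGA implementation of the post-quantum cryptosystem CRYSTALS-Kyber is proposed. First, the flow of the Kyber algorithm is described, and the execution of NTT, INTT and CWM is analyzed. Second, the overall FPGA architecture is presented: the butterfly unit is designed with pipelining, and Barrett modular reduction together with optimized CWM scheduling improves computational efficiency. In addition, 32 butterfly units are placed to execute in parallel, shortening the overall computation time. Finally, storage across multiple RAM channels is optimized, with alternating data access control and RAM resource reuse improving memory access efficiency. Furthermore, a loosely coupled architecture is adopted, and DMA communication schedules the overall computation. Experimental results and analysis show that the proposed scheme completes the NTT, INTT and CWM operations in 44, 49 and 163 clock cycles respectively, outperforming other schemes and achieving a high energy-efficiency ratio.

Keywords (translated from the Chinese): post-quantum cryptography  CRYSTALS-Kyber  field-programmable gate array  number theoretic transform  polynomial multiplication  butterfly operation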

FPGA multi-unit parallel optimization and implementation of post-quantum cryptography CRYSTALS-Kyber
LI Bin, CHEN Xiaojie, FENG Feng, ZHOU Qinglei. FPGA multi-unit parallel optimization and implementation of post-quantum cryptography CRYSTALS-Kyber [J]. Journal on Communications, 2022, 0(2): 196-207
Authors:LI Bin  CHEN Xiaojie  FENG Feng  ZHOU Qinglei
Affiliation: School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450001, China
Abstract: In lattice-based post-quantum cryptography, polynomial multiplication is complex and time-consuming. To improve the computational efficiency of lattice-based cryptography in practical applications, an FPGA multi-unit parallel optimization and implementation of the post-quantum cryptosystem CRYSTALS-Kyber was proposed. Firstly, the flow of the Kyber algorithm was described and the execution of NTT, INTT and CWM was analyzed. Secondly, the overall FPGA architecture was given, the butterfly arithmetic unit was designed with pipelining, and Barrett modular reduction and CWM scheduling optimization were used to improve computational efficiency. At the same time, 32 butterfly arithmetic units were executed in parallel, which shortened the overall computation time. Finally, the multi-RAM channel storage was optimized to improve memory access efficiency through alternating data access control and RAM resource reuse. In addition, with a loosely coupled architecture, the overall operation scheduling was realized by DMA communication. The experimental results and analysis show that the proposed scheme completes the NTT, INTT and CWM operations within 44, 49 and 163 clock cycles respectively, which is superior to other schemes, and that it has a high energy-efficiency ratio.
Keywords:post-quantum cryptography  CRYSTALS-Kyber  FPGA  NTT  polynomial multiplication  butterfly arithmetic
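
The building blocks named in the abstract (Barrett reduction, the butterfly operation, and a layered NTT shared among parallel butterfly units) can be illustrated in software. The Python sketch below is not the authors' hardware design. It assumes the standard Kyber parameters (q = 3329, n = 256, ζ = 17), which the abstract does not state, and a Barrett shift of k = 24 chosen here for illustration. All function names (`barrett_reduce`, `ct_butterfly`, `ntt`) and the `units` parameter are hypothetical.

```python
# Illustrative software model only; parameters are assumptions, not taken
# from the abstract.
Q = 3329             # Kyber modulus (assumed standard parameter)
N = 256              # polynomial length (assumed standard parameter)
ZETA = 17            # primitive 256th root of unity mod Q used by Kyber
K = 24               # Barrett shift chosen for this sketch (2**24 > Q*Q)
M = (1 << K) // Q    # precomputed Barrett constant floor(2**K / Q)


def barrett_reduce(x):
    """Reduce 0 <= x < Q*Q into [0, Q) with a multiply, a shift and
    one conditional subtraction, avoiding a division."""
    t = (x * M) >> K          # estimate of x // Q, low by at most 1
    r = x - t * Q             # 0 <= r < 2*Q
    return r - Q if r >= Q else r


def ct_butterfly(a, b, w):
    """Cooley-Tukey butterfly: (a, b) -> (a + w*b, a - w*b) mod Q."""
    t = barrett_reduce(w * b)
    return barrett_reduce(a + t), barrett_reduce(a + Q - t)


def bitrev7(i):
    """Reverse the 7-bit binary representation of i."""
    return int(format(i, "07b")[::-1], 2)


# Twiddle factors in bit-reversed order, as in the Kyber NTT.
ZETAS = [pow(ZETA, bitrev7(i), Q) for i in range(128)]


def ntt(poly, units=32):
    """Seven-layer in-place NTT on coefficients in [0, Q).

    Also returns an idealized issue-cycle count, assuming `units`
    butterflies start per cycle. The count ignores pipeline fill and
    memory latency, so it is not comparable to measured cycle counts.
    """
    f = list(poly)
    k = 1
    length = 128
    issue_cycles = 0
    while length >= 2:
        butterflies = 0
        for start in range(0, N, 2 * length):
            w = ZETAS[k]
            k += 1
            for j in range(start, start + length):
                f[j], f[j + length] = ct_butterfly(f[j], f[j + length], w)
                butterflies += 1
        issue_cycles += -(-butterflies // units)   # ceiling division
        length //= 2
    return f, issue_cycles
```

Each of the seven layers performs 128 independent butterflies, which is why the work divides evenly across 32 units. The issue-cycle figure returned by this sketch is a lower bound under the stated simplifications and should not be read as a reproduction of the cycle counts reported in the abstract.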
This article is indexed in databases including VIP (维普).