A neural network accelerated optimization method for FPGA
Journal of Combinatorial Optimization (IF 0.9) Pub Date: 2024-06-25, DOI: 10.1007/s10878-024-01117-x
Zhengwei Hu, Sijie Zhu, Leilei Wang, Wangbin Cao, Zhiyuan Xie

A neural network accelerated optimization method for the FPGA hardware platform is proposed. The method realizes optimized deployment of neural network algorithms on FPGA hardware from three aspects: computational speed, flexibility of transplantation, and development method. Replacing multiplication with the Mitchell logarithmic approximation not only breaks through the speed bottleneck of neural network hardware acceleration caused by the long multiplication period, but also frees the parallel acceleration of the neural network from dependence on the number of hardware multipliers in the FPGA, giving full play to the advantages of FPGA parallelism and maximizing computing speed. Based on a configurable strategy for neural network parameters, the number of network layers and the number of nodes within each layer can be adjusted according to the logical resources of a given FPGA, improving the flexibility of neural network transplantation. The adoption of the HLS development method overcomes the shortcomings of the RTL method in designing complex neural network algorithms, namely high development difficulty and long development cycles. Using the Cyclone V SE 5CSEBA6U23I7 FPGA as the target device, a parameter-configurable BP neural network was designed based on the proposed method. The usage of logical resources such as ALUTs, flip-flops, RAM, and DSPs was 39.6%, 40%, 56.9%, and 18.3% of the pre-optimization values, respectively. The feasibility of the proposed method was verified using MNIST digit recognition and facial recognition as application scenarios. Compared with the pre-optimization design, the test time for MNIST digit recognition was reduced to 67.58% with a recognition-rate loss of only 0.195%; the test time for facial recognition was reduced to 69.571%, and the recognition-rate loss when combined with the LDA algorithm was within 4%.
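The multiplier replacement described above rests on Mitchell's classic logarithmic approximation: for a positive integer n = 2^k(1 + x) with 0 ≤ x < 1, log2(n) ≈ k + x, so a product a·b can be computed with an addition of approximate logarithms followed by an approximate antilogarithm, using only shifts and adds in hardware. The paper's FPGA implementation is not available here; the following is a minimal Python sketch of the principle, with `mitchell_log2` and `mitchell_mul` as illustrative names (the maximum relative error of this scheme is known to be about 11.1%):

```python
def mitchell_log2(n: int) -> float:
    """Mitchell approximation of log2(n) for a positive integer n.

    Writes n = 2^k * (1 + x) with 0 <= x < 1, then approximates
    log2(1 + x) by x, so log2(n) ~= k + x. In hardware, k is the
    position of the leading one and x is the shifted mantissa.
    """
    k = n.bit_length() - 1          # characteristic: floor(log2 n)
    x = n / (1 << k) - 1.0          # mantissa fraction in [0, 1)
    return k + x

def mitchell_mul(a: int, b: int) -> float:
    """Approximate a * b by adding Mitchell logs and inverting.

    The inverse uses the same linear approximation in reverse:
    2^(k + x) ~= (1 + x) * 2^k.
    """
    s = mitchell_log2(a) + mitchell_log2(b)
    k = int(s)                      # integer part -> shift amount
    x = s - k                       # fractional part -> mantissa
    return (1.0 + x) * (1 << k)

# Powers of two are exact (mantissa fraction is zero):
# mitchell_mul(16, 8) -> 128.0
# General operands carry a bounded approximation error:
# mitchell_mul(3, 5) -> 14.0 (exact product is 15)
```

Because both the logarithm and antilogarithm steps reduce to a leading-one detector, a shift, and an add, the multiply is replaced entirely by logic that does not consume FPGA DSP multiplier blocks, which is what removes the dependence on the number of hardware multipliers noted in the abstract.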



