A Dynamic Execution Neural Network Processor for Fine-Grained Mixed-Precision Model Training Based on Online Quantization Sensitivity Analysis
IEEE Journal of Solid-State Circuits (IF 4.6), Pub Date: 2024-03-28, DOI: 10.1109/jssc.2024.3377292
Ruoyang Liu 1, Chenhan Wei 1, Yixiong Yang 2, Wenxun Wang 1, Binbin Yuan 3, Huazhong Yang 1, Yongpan Liu 1

As neural network (NN) training cost has been growing exponentially over the past decade, developing high-speed and energy-efficient training methods has become an urgent task. Fine-grained mixed-precision low-bit training is the most promising approach to high-efficiency training, but it needs dedicated processor designs to overcome the overhead in control, storage, and I/O and to remove the power bottleneck in floating-point (FP) units. This article presents a dynamic execution NN processor that supports fine-grained mixed-precision training through an online quantization sensitivity analysis. Three key features are proposed: the quantization-sensitivity-aware dynamic execution controller, the dynamic bit-width adaptive datapath design, and the low-power multi-level-aligned block-FP unit (BFPU). The chip achieves 13.2-TFLOPS/W energy efficiency and 1.07-TFLOPS/mm² area efficiency.
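For readers unfamiliar with block floating point, the following is a minimal, generic sketch of the idea: the values in a block share one exponent, and each value keeps only a short signed integer mantissa. It is not the chip's BFPU datapath or its multi-level alignment scheme, neither of which the abstract describes. The function names `bfp_quantize` and `bfp_dequantize`, the sign-plus-magnitude mantissa layout, and the round-to-nearest policy are all assumptions made for illustration.

```python
import math


def bfp_quantize(block, mantissa_bits):
    """Quantize a list of floats to one shared exponent plus integer mantissas.

    Illustrative only: a generic block-FP scheme, not the paper's hardware.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1; e is the shared exponent.
    _, shared_exp = math.frexp(max_abs)
    # Assumption: one bit holds the sign, so magnitudes use mantissa_bits - 1 bits.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round to the nearest step and clamp to the representable range.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas


def bfp_dequantize(shared_exp, mantissas, mantissa_bits):
    """Reconstruct approximate floats from a shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


if __name__ == "__main__":
    block = [1.0, 0.3, -0.07, 0.002]
    # A wider mantissa keeps more of the small values in the block.
    for bits in (4, 8):
        exp, man = bfp_quantize(block, bits)
        print(bits, exp, man, bfp_dequantize(exp, man, bits))
```

In this sketch the mantissa width is a per-call parameter, which loosely mirrors the notion of choosing precision per block. How the processor actually selects bit-widths from its online sensitivity analysis is not specified in the abstract and is not modeled here.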


