A Dynamic Execution Neural Network Processor for Fine-Grained Mixed-Precision Model Training Based on Online Quantization Sensitivity Analysis
IEEE Journal of Solid-State Circuits (IF 4.6), Pub Date: 2024-03-28, DOI: 10.1109/jssc.2024.3377292
Ruoyang Liu 1 , Chenhan Wei 1 , Yixiong Yang 2 , Wenxun Wang 1 , Binbin Yuan 3 , Huazhong Yang 1 , Yongpan Liu 1

As neural network (NN) training cost has been growing exponentially over the past decade, developing high-speed and energy-efficient training methods has become an urgent task. Fine-grained mixed-precision low-bit training is the most promising way for high-efficiency training, but it needs dedicated processor designs to overcome the overhead in control, storage, and I/O and remove the power bottleneck in floating-point (FP) units. This article presents a dynamic execution NN processor supporting fine-grained mixed-precision training through an online quantization sensitivity analysis. Three key features are proposed: the quantization-sensitivity-aware dynamic execution controller, dynamic bit-width adaptive datapath design, and the low-power multi-level-aligned block-FP unit (BFPU). This chip achieves 13.2-TFLOPS/W energy efficiency and 1.07-TFLOPS/mm² area efficiency.
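
The abstract does not spell out how block-FP quantization or the sensitivity analysis works, so the following is only a generic illustration of the idea, not the chip's method. It sketches textbook block floating point (one shared exponent per block, signed integer mantissas) and a hypothetical rule that picks the smallest bit-width whose quantization error stays under a tolerance. The function names, the candidate widths (2, 4, 8), and the error threshold are all assumptions made for this example.

```python
import math


def bfp_quantize(block, mantissa_bits):
    """Quantize floats to a shared exponent with signed integer mantissas.

    Generic block floating point; not taken from the paper.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0.0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1; e is the shared exponent.
    _, shared_exp = math.frexp(max_abs)
    # One bit is reserved for the sign, so the grid step is 2**(e - (bits - 1)).
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    quantized = []
    for x in block:
        q = round(x / scale)
        q = max(-limit, min(limit, q))  # clamp to the representable range
        quantized.append(q * scale)
    return quantized


def quantization_error(block, mantissa_bits):
    """Mean squared error introduced by quantizing the block."""
    q = bfp_quantize(block, mantissa_bits)
    return sum((a - b) ** 2 for a, b in zip(block, q)) / len(block)


def pick_bit_width(block, candidates=(2, 4, 8), tolerance=1e-3):
    """Hypothetical sensitivity rule: smallest width with error <= tolerance."""
    for bits in sorted(candidates):
        if quantization_error(block, bits) <= tolerance:
            return bits
    return max(candidates)


if __name__ == "__main__":
    block = [1.0, 0.5, -0.25]
    print(bfp_quantize(block, 8))
    print(pick_bit_width(block))
```

In this sketch, a block whose values sit on a coarse grid gets a narrow width, while a block with a wide dynamic range needs more mantissa bits. A per-block decision of this kind is what "fine-grained" mixed precision generally refers to; how the processor makes that decision online is described in the full paper, not in this abstract.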

Updated: 2024-03-28