Reducing inference energy consumption using dual complementary CNNs
Future Generation Computer Systems (IF 6.2) Pub Date: 2024-11-24, DOI: 10.1016/j.future.2024.107606 — Michail Kinnas, John Violos, Ioannis Kompatsiaris, Symeon Papadopoulos
The energy efficiency of Convolutional Neural Networks (CNNs) has become an important area of research, with various strategies being developed to minimize the power consumption of these models. Previous efforts, including techniques like model pruning, quantization, and hardware optimization, have made significant strides in this direction. However, there remains a need for more effective on-device AI solutions that balance energy efficiency with model performance. In this paper, we propose a novel approach to reduce the energy requirements of CNN inference. Our methodology employs two small complementary CNNs that collaborate by covering each other's "weaknesses" in predictions. If the first CNN's prediction confidence is low, the second CNN is invoked with the aim of producing a higher-confidence prediction. This dual-CNN setup significantly reduces energy consumption compared to using a single large deep CNN. Additionally, we propose a memory component that retains previous classifications for identical inputs, bypassing the need to re-invoke the CNNs for the same input and further saving energy. Our experiments on a Jetson Nano computer demonstrate an energy reduction of up to 85.8% on modified datasets in which each sample was duplicated once. These findings indicate that leveraging a complementary CNN pair along with a memory component effectively reduces inference energy while maintaining high accuracy.
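To make the decision flow concrete, here is a minimal Python/PyTorch sketch of the confidence-gated cascade and memory component the abstract describes. All names (model_a, model_b, the threshold tau, the SHA-256 cache key) are illustrative assumptions, not the authors' implementation; the fallback rule of keeping the more confident of the two predictions is likewise one plausible reading of "covering each other's weaknesses".

```python
import hashlib
import torch
import torch.nn.functional as F

def cascade_predict(x, model_a, model_b, cache, tau=0.9):
    """Confidence-gated dual-CNN inference with a memoization cache.

    x       : input tensor of shape (1, C, H, W)
    model_a : small primary CNN, always run first (cheapest path)
    model_b : small complementary CNN, invoked only on low confidence
    cache   : dict mapping input hashes to previously computed labels
    tau     : softmax-confidence threshold (hypothetical value)
    """
    # Memory component: an identical input skips both CNNs entirely.
    key = hashlib.sha256(x.detach().cpu().numpy().tobytes()).hexdigest()
    if key in cache:
        return cache[key]

    with torch.no_grad():
        probs_a = F.softmax(model_a(x), dim=1)
        conf_a, label_a = probs_a.max(dim=1)
        if conf_a.item() >= tau:
            # First CNN is confident enough; second CNN never runs.
            label = label_a.item()
        else:
            # Invoke the complementary CNN and keep the more
            # confident of the two predictions.
            probs_b = F.softmax(model_b(x), dim=1)
            conf_b, label_b = probs_b.max(dim=1)
            label = label_b.item() if conf_b.item() > conf_a.item() else label_a.item()

    cache[key] = label
    return label
```

In a sketch like this, tau trades accuracy against how often the second CNN is invoked, so it would be tuned on a validation set; the cache is what yields the large savings reported on the duplicated-sample datasets, since repeated inputs cost only a hash lookup.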
Updated: 2024-11-24