Metasurface-Generated Large and Arbitrary Analog Convolution Kernels for Accelerated Machine Vision
ACS Photonics (IF 6.5), Pub Date: 2024-12-04, DOI: 10.1021/acsphotonics.4c01874
Ruiqi Liang, Shuai Wang, Yiying Dong, Liu Li, Ying Kuang, Bohan Zhang, Yuanmu Yang
In the rapidly evolving field of artificial intelligence, convolutional neural networks are essential for tackling complex challenges such as machine vision and medical diagnosis. Recently, to address the processing-speed and power-consumption limits of conventional digital convolution, many optical components have been proposed to replace the digital convolution layer in the neural network, accelerating various machine vision tasks. Nonetheless, the analog nature of the optical convolution kernel has not been fully explored. Here, we develop a spatial frequency domain training method to create arbitrarily shaped analog convolution kernels using an optical metasurface as the convolution layer, with a receptive field far larger than that of digital convolution kernels. By employing spatial multiplexing, multiple parallel convolution kernels with both positive and negative weights are generated under incoherent illumination. We experimentally demonstrate a 98.59% classification accuracy on the MNIST data set, with simulations showing 92.63% and 68.67% accuracy on the Fashion-MNIST and CIFAR-10 data sets, respectively, with additional digital layers. This work underscores the unique advantage of analog optical convolution, offering a promising avenue to accelerate machine vision tasks, especially in edge devices.
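The abstract does not spell out how signed weights are obtained under incoherent illumination, where the optical system convolves intensities and a point spread function cannot be negative. One common approach, shown here purely as an illustrative assumption and not as the authors' implementation, is to split a signed kernel into two non-negative parts, convolve with each in a separate channel, and subtract the results digitally. The function names, the 28×28 image size, and the 7×7 kernel size below are hypothetical, and the convolution is computed in the spatial-frequency domain with NumPy's FFT as a stand-in for the optics.

```python
import numpy as np

def split_signed_kernel(kernel):
    """Split a signed kernel into non-negative parts so that kernel = pos - neg."""
    pos = np.maximum(kernel, 0.0)   # positive weights only
    neg = np.maximum(-kernel, 0.0)  # magnitudes of the negative weights
    return pos, neg

def fft_convolve(image, kernel):
    """Circular 2D convolution computed in the spatial-frequency domain."""
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel  # zero-pad the kernel to the image size
    spectrum = np.fft.fft2(image) * np.fft.fft2(padded)
    return np.real(np.fft.ifft2(spectrum))

def signed_incoherent_convolution(image, kernel):
    """Emulate a signed kernel using two non-negative (intensity-like) channels.

    Assumption: each channel stands for one spatially multiplexed
    non-negative point spread function; the subtraction is done digitally.
    """
    pos, neg = split_signed_kernel(kernel)
    return fft_convolve(image, pos) - fft_convolve(image, neg)

# Hypothetical example data: a random "image" and a random signed kernel.
rng = np.random.default_rng(0)
image = rng.random((28, 28))
kernel = rng.standard_normal((7, 7))

result = signed_incoherent_convolution(image, kernel)
reference = fft_convolve(image, kernel)  # direct signed convolution for comparison
```

Because convolution is linear, the difference of the two non-negative channels matches the direct signed convolution up to floating-point error, which is why the split costs an extra channel but no accuracy in this idealized model.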
Updated: 2024-12-04