EfficientQ: An efficient and accurate post-training neural network quantization method for medical image segmentation
Medical Image Analysis ( IF 10.7 ) Pub Date : 2024-07-22 , DOI: 10.1016/j.media.2024.103277
Rongzhao Zhang 1 , Albert C S Chung 1

Model quantization is a promising technique that simultaneously compresses and accelerates a deep neural network by limiting its computation bit-width, and it plays a crucial role in the fast-growing AI industry. Although model quantization has succeeded in producing well-performing low-bit models, the quantization process itself can still be expensive, as it may involve a long fine-tuning stage on a large, well-annotated training set. To make the quantization process more efficient in terms of both time and data requirements, this paper proposes a fast and accurate post-training quantization method, namely EfficientQ. We develop this new method with a layer-wise optimization strategy and leverage the powerful alternating direction method of multipliers (ADMM) algorithm to ensure fast convergence. Furthermore, a weight regularization scheme is incorporated to provide more guidance for the optimization of the discrete weights, and a self-adaptive attention mechanism is proposed to combat the class imbalance problem. Extensive comparison and ablation experiments are conducted on two publicly available medical image segmentation datasets, LiTS and BraTS2020, and the results demonstrate the superiority of the proposed method over various existing post-training quantization methods in terms of both accuracy and optimization speed. Remarkably, with EfficientQ, quantizing a practical 3D UNet takes less than 5 min on a single GPU with a single data sample. The source code is available at https://github.com/rongzhao-zhang/EfficientQ.
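
To make the idea of layer-wise post-training quantization with ADMM more concrete, here is a minimal NumPy sketch for a single linear layer. It is a generic textbook-style formulation, not the EfficientQ algorithm itself: it omits the weight regularization and self-adaptive attention described in the abstract, and the function names (`project_to_grid`, `admm_quantize_layer`), the symmetric per-tensor scale, and the values of `rho` and `iters` are all illustrative assumptions. For the authors' actual implementation, see the repository linked in the abstract.

```python
import numpy as np

def project_to_grid(w, scale, bits):
    """Round each entry to the nearest point of a symmetric uniform grid."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return scale * np.clip(np.round(w / scale), qmin, qmax)

def admm_quantize_layer(x, w_fp, bits=4, rho=1.0, iters=50):
    """Generic ADMM sketch (hypothetical helper, not the paper's code).

    Approximately solves, for one layer with calibration inputs x:
        min_w 0.5 * ||x @ w_fp - x @ w||^2   s.t.  w lies on the quantization grid
    by splitting w into a continuous copy and a discrete copy q.
    """
    # Assumption: one scale per tensor, chosen from the largest weight magnitude.
    scale = np.abs(w_fp).max() / (2 ** (bits - 1) - 1)
    gram = x.T @ x                                  # layer-input Gram matrix
    lhs = gram + rho * np.eye(gram.shape[0])        # fixed across iterations
    rhs_fixed = gram @ w_fp
    q = project_to_grid(w_fp, scale, bits)          # start from round-to-nearest
    u = np.zeros_like(w_fp)                         # scaled dual variable
    for _ in range(iters):
        # Continuous step: closed-form ridge-like least-squares solve.
        w = np.linalg.solve(lhs, rhs_fixed + rho * (q - u))
        # Discrete step: project onto the quantization grid.
        q = project_to_grid(w + u, scale, bits)
        # Dual step: accumulate the remaining disagreement between w and q.
        u += w - q
    return q, scale

# Toy calibration data standing in for one layer's inputs and weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((256, 16))
w_fp = rng.standard_normal((16, 8))
w_q, scale = admm_quantize_layer(x, w_fp, bits=4)

# Relative output error of the quantized layer on the calibration inputs.
print(np.linalg.norm(x @ w_fp - x @ w_q) / np.linalg.norm(x @ w_fp))
```

Because the grid constraint is non-convex, ADMM here is a heuristic with no convergence guarantee; the sketch only shows how the continuous solve, the projection, and the dual update alternate within one layer.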

Updated: 2024-07-22