465 Comparative analysis of semantic segmentation and deep regression models with supervised pre-training for accurate prediction of pig body weight from video data: Insights from industry-scale datasets, Journal of Animal Science


Journal of Animal Science (IF 2.7), Pub Date: 2024-09-14, DOI: 10.1093/jas/skae234.472
Ye Bi₁, Jianhua Xuan₁, Yijian Huang₂, Gota Morota₁


Accurate pig body weight (BW) measurement is essential for producers because it is related to pig growth, health, and marketing, yet conventional manual weighing is time-consuming and may stress the animals. Although three-dimensional cameras coupled with computer vision techniques are increasingly adopted for pig BW estimation, validation of these methods on industry-scale data is still limited. Semantic segmentation and regression with supervised pre-training are prominent among the methods currently in use. Therefore, the objectives of this study were: 1) to estimate pig BW from repeatedly measured video data obtained from a commercial setting, 2) to compare the performance of two image analysis methods, thresholding segmentation and deep regression, and 3) to evaluate the predictive ability of BW estimation models. An Intel RealSense D435 camera was installed on a commercial farm to collect top-view videos of 540 pigs biweekly at five time points over three months. At the same time, manually measured BW records were collected using a digital weighing system. We used an automated video conversion pipeline and a fine-tuned YOLOv8 model to pre-process the raw depth videos, which yielded a total of 151,756 depth images and depth map files. Adaptive thresholding was applied to segment the pig body from the background. Four image-derived biometric features (dorsal length, abdominal width, height, and volume) were estimated from the segmented images and fitted using ordinary least squares and random forest models. We applied transfer learning by initializing five deep learning models (ResNet50, Xception, EfficientNetV2S, ConvNeXtBase, and Vision Transformer) with weights pre-trained on ImageNet, and then fine-tuned these models on the pig depth images. The last layer of each model was adapted for linear regression, enabling direct estimation of BW from each image without additional image pre-processing steps. We employed random repeated subsampling cross-validation, assigning 80% of the pigs to training and 20% to testing, to evaluate prediction performance at each time point. The best prediction coefficients of determination for the five time points were 0.76, 0.86, 0.90, 0.83, and 0.90, and the corresponding best mean absolute percentage errors were 4.57%, 3.76%, 3.01%, 3.41%, and 4.84%. On average, the Xception model gave the best prediction coefficient of determination and mean absolute percentage error, 0.90 and 3.01%, respectively. Our results suggest that deep learning-based supervised learning models improve the prediction of pig BW from industry-scale depth video data.
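The evaluation scheme described in the abstract (repeated random 80/20 splits made at the pig level, scored by the coefficient of determination and mean absolute percentage error) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the single-feature least-squares model standing in for the fitted models, the per-image scoring, the definition of the coefficient of determination as 1 - SS_res/SS_tot, and the synthetic data are all assumptions made for illustration.

```python
import random


def split_by_pig(pig_ids, test_frac=0.2, rng=None):
    """Hold out a fraction of pigs (not images), so no animal is in both sets."""
    if rng is None:
        rng = random.Random()
    pigs = sorted(set(pig_ids))
    rng.shuffle(pigs)
    n_test = max(1, round(test_frac * len(pigs)))
    test_pigs = set(pigs[:n_test])
    train_idx = [i for i, p in enumerate(pig_ids) if p not in test_pigs]
    test_idx = [i for i, p in enumerate(pig_ids) if p in test_pigs]
    return train_idx, test_idx


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def fit_line(x, y):
    """Closed-form least squares for one feature (a stand-in model only)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def repeated_subsampling(x, y, pig_ids, n_repeats=10, seed=0):
    """Average per-image R^2 and MAPE over repeated pig-level 80/20 splits."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train_idx, test_idx = split_by_pig(pig_ids, 0.2, rng)
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        y_test = [y[i] for i in test_idx]
        y_hat = [slope * x[i] + intercept for i in test_idx]
        scores.append((r_squared(y_test, y_hat), mape(y_test, y_hat)))
    mean_r2 = sum(s[0] for s in scores) / len(scores)
    mean_mape = sum(s[1] for s in scores) / len(scores)
    return mean_r2, mean_mape


if __name__ == "__main__":
    # Synthetic, hypothetical data: 50 pigs with 5 images each, where an
    # image-derived "volume" feature is a noisy function of body weight.
    data_rng = random.Random(42)
    pig_ids, volume, weight = [], [], []
    for pig in range(50):
        true_bw = data_rng.uniform(40.0, 120.0)
        for _ in range(5):
            pig_ids.append(pig)
            volume.append(1.1 * true_bw + data_rng.gauss(0.0, 3.0))
            weight.append(true_bw)
    print(repeated_subsampling(volume, weight, pig_ids))
```

Splitting on pig identity rather than on individual images matters because each animal contributes many images. An image-level split would place frames of the same pig in both the training and testing sets and would likely overstate prediction performance.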


Updated: 2024-09-14
