Precision Agriculture (IF 5.4), Pub Date: 2024-09-04, DOI: 10.1007/s11119-024-10184-3
Dhahi Al-Shammari, Yang Chen, Niranjan S. Wimalathunge, Chen Wang, Si Yang Han, Thomas F. A. Bishop
Introduction
Data-driven models (DDMs) are increasingly used for crop yield prediction because they can capture complex patterns and relationships. Their predictions depend heavily on the data supplied as inputs. DDMs are effective on their own, but they can also be complemented by inputs derived from mechanistic models (MMs).
Methods
This study investigated whether the predictive quality of DDMs improves when MM outputs, specifically biomass and soil moisture, are used as features alongside conventional data sources such as satellite imagery, weather, and soil information. Four experiments were performed, each using a different set of features for prediction: Experiment 1 combined MM outputs with conventional data; Experiment 2 excluded MM outputs; Experiment 3 matched Experiment 1 but omitted all conventional temporal data; Experiment 4 used MM outputs only. The study covered ten field-years of wheat and chickpea yield data, and models were fitted with the eXtreme Gradient Boosting (XGBOOST) algorithm. Performance was evaluated using root mean square error (RMSE) and the concordance correlation coefficient (CCC).
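The four experiments differ only in which feature groups are passed to the model, which can be sketched as a small lookup. This is a minimal illustration, not the authors' code: the column names are hypothetical, and the split of conventional data into temporal (satellite, weather) and non-temporal (soil) groups is an assumption, since the abstract does not list the individual features.

```python
# Hypothetical feature groups. Column names are illustrative only and
# do not come from the study.
MM_OUTPUTS = ["mm_biomass", "mm_soil_moisture"]
# Assumption: satellite and weather data are the "temporal" conventional data.
CONVENTIONAL_TEMPORAL = ["satellite_ndvi", "rainfall", "temperature"]
# Assumption: soil information is the non-temporal conventional data.
CONVENTIONAL_STATIC = ["soil_clay", "soil_ph"]

# Feature sets as described for Experiments 1 to 4.
EXPERIMENT_FEATURES = {
    1: MM_OUTPUTS + CONVENTIONAL_TEMPORAL + CONVENTIONAL_STATIC,  # MM + all conventional
    2: CONVENTIONAL_TEMPORAL + CONVENTIONAL_STATIC,               # no MM outputs
    3: MM_OUTPUTS + CONVENTIONAL_STATIC,                          # MM + non-temporal only
    4: list(MM_OUTPUTS),                                          # MM outputs only
}


def features_for(experiment: int) -> list[str]:
    """Return the list of feature columns used in a given experiment."""
    if experiment not in EXPERIMENT_FEATURES:
        raise ValueError(f"unknown experiment: {experiment}")
    return EXPERIMENT_FEATURES[experiment]


if __name__ == "__main__":
    for exp in sorted(EXPERIMENT_FEATURES):
        print(exp, features_for(exp))
```

Each list could then be used to select columns from a feature table before fitting a gradient-boosting regressor; keeping the groups in one place makes it easy to confirm that Experiment 3 is Experiment 1 minus the temporal conventional features.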
Results and conclusions
The validation results showed that the XGBOOST model had similar predictive power for both crops in Experiments 1, 2, and 3. For chickpeas, the CCC ranged from 0.89 to 0.91 and the RMSE from 0.23 to 0.25 t ha⁻¹. For wheat, the CCC ranged from 0.87 to 0.92 and the RMSE from 0.29 to 0.35 t ha⁻¹. In Experiment 4, however, accuracy dropped significantly: the CCC fell to 0.47 for chickpeas and 0.36 for wheat, and the RMSE rose to 0.46 and 0.65 t ha⁻¹, respectively. Experiments 1, 2, and 3 were therefore comparably effective, but Experiment 3 is recommended because it achieves similar predictive quality with a simpler, more interpretable model that uses biomass and soil moisture alongside non-temporal conventional features.
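For readers less familiar with the two metrics reported above, the sketch below computes RMSE and the concordance correlation coefficient. It assumes the standard definition of Lin's CCC, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with population (1/n) variances; the abstract does not state which variant was used. The yield values are made-up toy numbers, not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def ccc(observed, predicted):
    """Lin's concordance correlation coefficient (population variances).

    Undefined (division by zero) when both series are constant and equal.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    return 2 * cov / (var_o + var_p + (mean_o - mean_p) ** 2)


if __name__ == "__main__":
    # Toy yields in t/ha, invented for illustration only.
    observed = [1.0, 2.0, 3.0]
    biased = [2.0, 3.0, 4.0]  # perfectly correlated, but offset by 1 t/ha
    print(rmse(observed, biased))
    print(ccc(observed, biased))
```

Unlike the Pearson correlation, which would be 1 for the offset predictions in the toy example, the CCC penalises the systematic bias, which is why it is a useful companion to RMSE when judging agreement with observed yield.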
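Title (translated from Chinese): Incorporating mechanistic model outputs as features of data-driven models for yield prediction: a case study on wheat and chickpea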