Soil Science-Informed Machine Learning,Geoderma


Geoderma (IF 5.6), Pub Date: 2024-11-14, DOI: 10.1016/j.geoderma.2024.117094
Budiman Minasny, Toshiyuki Bandai, Teamrat A. Ghezzehei, Yin-Chung Huang, Yuxin Ma, Alex B. McBratney, Wartini Ng, Sarem Norouzi, Jose Padarian, Rudiyanto, Amin Sharififar, Quentin Styc, Marliana Widyastuti

Machine learning (ML) applications in soil science have significantly increased over the past two decades, reflecting a growing trend towards data-driven research addressing soil security. This extensive application has mainly focused on enhancing predictions of soil properties, particularly soil organic carbon, and improving the accuracy of digital soil mapping (DSM). Despite these advancements, the application of ML in soil science faces challenges related to data scarcity and the interpretability of ML models. There is a need for a shift towards Soil Science-Informed ML (SoilML) models that use the power of ML but also incorporate soil science knowledge in the training process to make predictions more reliable and generalisable. This paper proposes methodologies for embedding ML models with soil science knowledge to overcome current limitations. Incorporating soil science knowledge into ML models involves using observational priors to enhance training datasets, designing model structures which reflect soil science principles, and supervising model training with soil science-informed loss functions. The informed loss functions include observational constraints, coherency rules such as regularisation to avoid overfitting, and prior or soil-knowledge constraints that incorporate existing information about the parameters or outputs. By way of illustration, we present examples from four fields: digital soil mapping, soil spectroscopy, pedotransfer functions, and dynamic soil property models. We discuss the potential to integrate process-based models for improved prediction, the use of physics-informed neural networks, limitations, and the issue of overparametrisation. These approaches improve the relevance of ML predictions in soil science and enhance the models’ ability to generalise across different scenarios while maintaining soil science principles, transparency and reliability.
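
As a rough illustration of the three loss components named in the abstract (observational constraints, a coherency rule such as regularisation, and a soil-knowledge constraint), the sketch below combines them into one scalar loss. It is not the authors' formulation: the function name `soil_informed_loss`, the weighting parameters `lam_reg` and `lam_prior`, and the choice of constraint (sand, silt and clay percentages must be non-negative and sum to 100) are all assumptions made for this example.

```python
import numpy as np


def soil_informed_loss(y_pred, y_obs, weights, lam_reg=1e-3, lam_prior=1.0):
    """Hypothetical composite loss: data misfit + L2 penalty + soil-knowledge penalty.

    y_pred, y_obs: arrays of shape (n_samples, 3) with sand, silt, clay in percent.
    weights: 1-D array of model parameters.
    lam_reg, lam_prior: assumed weighting factors for the two penalty terms.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Observational constraint: mean squared error against measurements.
    data_term = np.mean((y_pred - y_obs) ** 2)

    # Coherency rule: L2 regularisation on the parameters to discourage overfitting.
    reg_term = lam_reg * np.sum(weights ** 2)

    # Soil-knowledge constraint (assumed for this sketch): particle-size
    # fractions are non-negative and sum to 100 percent.
    sum_violation = (y_pred.sum(axis=1) - 100.0) ** 2
    negativity = np.sum(np.clip(-y_pred, 0.0, None) ** 2, axis=1)
    prior_term = lam_prior * np.mean(sum_violation + negativity)

    return data_term + reg_term + prior_term


# Usage: a prediction whose fractions sum to 110 is penalised by the prior term.
observed = np.array([[40.0, 40.0, 20.0]])
consistent = np.array([[40.0, 40.0, 20.0]])
inconsistent = np.array([[50.0, 40.0, 20.0]])
params = np.zeros(4)

loss_consistent = soil_informed_loss(consistent, observed, params)
loss_inconsistent = soil_informed_loss(inconsistent, observed, params)
```

Setting `lam_prior=0` reduces this to an ordinary regularised mean squared error, which makes it easy to compare a purely data-driven fit with a knowledge-constrained one on the same data.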


Updated: 2024-11-14
