Journal of Hazardous Materials (IF 12.2), Pub Date: 2023-09-18, DOI: 10.1016/j.jhazmat.2023.132577
Mihkel Kotli, Geven Piir, Uko Maran
Earthworms are among the most important invertebrates for soil health. Many chemical substances released into the environment for agricultural purposes, such as pesticides, may have unwanted effects on these organisms. It is therefore essential to understand the extent of a chemical's impact on soil health before making regulatory or commercial decisions. We hypothesize that a quantitative structure-activity relationship (QSAR) can be expressed between the structure of pesticide compounds and their acute toxicity to the earthworm species Eisenia fetida. Describing this relationship allows a better assessment of the impact of chemicals on this earthworm. To this end, a dataset of chemicals was collected from open-access sources and used to develop a mathematical model. A novel approach combining a genetic algorithm with Bayesian optimization was used to select structural features for the model and to optimize model parameters. The final QSAR classification model was built with the Random Forest algorithm and achieved a prediction accuracy of 0.78 on the training set and 0.80 on the test set. The model representation follows FAIR principles and is available on QsarDB.org.
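The feature-selection idea in the abstract can be illustrated with a minimal sketch of a genetic algorithm that searches over binary feature masks. This is not the authors' implementation: the data are synthetic, a nearest-centroid classifier stands in for the Random Forest so that the example needs only the standard library, the Bayesian optimization of model parameters is omitted, and every name (`make_data`, `centroid_accuracy`, `select_features`) and every setting (population size, mutation rate, size penalty) is a hypothetical choice made for illustration.

```python
import random

def make_data(rng, n=120, n_features=8, informative=(0, 3)):
    """Synthetic stand-in for a descriptor matrix; only `informative` columns carry signal."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # 1 = "toxic", 0 = "non-toxic" (hypothetical labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift informative descriptors for class 1
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X_tr, y_tr, X_te, y_te, mask):
    """Accuracy of a nearest-centroid classifier using only features where mask[j] == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    centroids = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X_tr, y_tr) if lab == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    hits = 0
    for x, lab in zip(X_te, y_te):
        dist = {c: sum((x[j] - mu) ** 2 for j, mu in zip(cols, centroids[c]))
                for c in (0, 1)}
        hits += int(min(dist, key=dist.get) == lab)
    return hits / len(y_te)

def select_features(X_tr, y_tr, X_val, y_val, rng,
                    pop_size=20, generations=15, p_mut=0.1):
    """Genetic algorithm over binary masks; fitness is validation accuracy minus a size penalty."""
    n = len(X_tr[0])

    def fitness(mask):
        # Small penalty per selected feature favours compact models (assumed value).
        return centroid_accuracy(X_tr, y_tr, X_val, y_val, mask) - 0.01 * sum(mask)

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection, keeps the best masks
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

rng = random.Random(0)  # fixed seed so the run is reproducible
X, y = make_data(rng)
X_tr, y_tr = X[:60], y[:60]        # training split
X_val, y_val = X[60:90], y[60:90]  # validation split used as GA fitness
X_te, y_te = X[90:], y[90:]        # held-out test split
best_mask = select_features(X_tr, y_tr, X_val, y_val, rng)
test_acc = centroid_accuracy(X_tr, y_tr, X_te, y_te, best_mask)
print("selected mask:", best_mask, "test accuracy:", round(test_acc, 2))
```

Accuracy here is the fraction of correctly classified compounds, the same metric the abstract reports for the training and test sets. In a setting closer to the paper, the fitness function would presumably wrap a Random Forest evaluated on molecular descriptors, with a separate optimizer tuning its hyperparameters.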
Title (translated from the Chinese version of this record): Pesticide effect on earthworm lethality via interpretable machine learning