Natural Resources Research (IF 4.8), Pub Date: 2024-07-13, DOI: 10.1007/s11053-024-10375-9
Pengfei Lv, Weiying Chen, Hai Li, Wangting Song
In deep mineral exploration, it is difficult to constrain the complex geological structures using a single geophysical method. To tackle the difficulty, integrated geophysical surveys and joint data interpretation are essential. Machine learning (ML) provides more accurate predictions than traditional methods, especially when dealing with complex data from multiple sources or varied statistical distributions. However, a major challenge in using ML for deep mineral exploration is the scarcity and imbalance of labeled samples, mainly due to budget constraints and the complexity of ore deposits. This issue reduces the accuracy of predictive models and introduces bias. Additionally, limited labeling can lead to difficulties in predicting previously undefined classes in training datasets. To address these challenges, we introduce a robust semisupervised ML framework that integrates diverse geophysical and geological datasets to improve model reliability with limited labeled data. Our approach uses a semisupervised ML variational Gaussian mixture model (SsL-VGMM) to handle issues related to insufficient and imbalanced data. We enhanced the model’s predictive capability for unseen data by introducing a novel penalty factor in the ‘cannot-link’ function. Moreover, we employed Bayesian optimization, focusing on the mean-mixture weight, to avoid local optima during model training. Our model demonstrated high accuracy and efficiency, with classification and prediction accuracies of 95.33% and 87.4%, respectively, in numerical and electromagnetic simulation scenarios. Its effectiveness was further validated by locating Pb–Zn–Ag deposits in Inner Mongolia, supported by actual drilling data. This paper highlights the model’s potential in complex mineral exploration and its significant practical and innovative value for deep mineral exploration.
Title: SsL-VGMM: A Semisupervised Machine Learning Model of Multisource Data Fusion for Lithology Prediction
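The abstract describes fitting a Gaussian mixture with only a few, imbalanced labels. The sketch below illustrates that general idea only and is not the authors' SsL-VGMM: it uses plain expectation-maximization rather than variational inference, and it leaves out the cannot-link penalty and the Bayesian optimization step, because the abstract does not specify them. All function names, the two synthetic features, and the cluster centers are hypothetical. Labeled samples are clamped to their class, and unlabeled samples receive soft class responsibilities.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def responsibilities(x, weights, means, variances):
    """Posterior class probabilities for one sample, computed in log space."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fit_semisupervised_gmm(X, labels, n_classes, n_iter=30, min_var=1e-2):
    """Fit one Gaussian per class; labels[i] is a class index, or None if unlabeled.

    Assumes every class has at least one labeled sample.
    """
    dim = len(X[0])
    # Start from the labels alone: unlabeled rows carry zero weight at first.
    resp = [[1.0 if lab == k else 0.0 for k in range(n_classes)] for lab in labels]
    for _ in range(n_iter):
        # M-step: weighted mixture weights, means, and per-feature variances.
        nk = [sum(r[k] for r in resp) for k in range(n_classes)]
        weights = [n / sum(nk) for n in nk]
        means = [
            [sum(r[k] * x[d] for r, x in zip(resp, X)) / nk[k] for d in range(dim)]
            for k in range(n_classes)
        ]
        variances = [
            [
                max(
                    min_var,
                    sum(r[k] * (x[d] - means[k][d]) ** 2 for r, x in zip(resp, X)) / nk[k],
                )
                for d in range(dim)
            ]
            for k in range(n_classes)
        ]
        # E-step: labeled rows stay clamped to their class; unlabeled rows are soft.
        resp = [
            [1.0 if lab == k else 0.0 for k in range(n_classes)]
            if lab is not None
            else responsibilities(x, weights, means, variances)
            for x, lab in zip(X, labels)
        ]
    return weights, means, variances


def predict(x, weights, means, variances):
    """Most probable class index for a single sample."""
    post = responsibilities(x, weights, means, variances)
    return max(range(len(post)), key=post.__getitem__)


# Synthetic demo (hypothetical data): two standardized features, two classes,
# with five labels for class 0 and only two for class 1.
random.seed(0)
labeled = {
    0: [(-1.0, -1.0), (1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (0.0, 0.0)],
    1: [(5.0, 7.0), (7.0, 5.0)],
}
centers = {0: (0.0, 0.0), 1: (6.0, 6.0)}
X, labels, truth = [], [], []
for k, points in labeled.items():
    for p in points:
        X.append(p)
        labels.append(k)
        truth.append(k)
for k, (cx, cy) in centers.items():
    for _ in range(200):
        X.append((random.gauss(cx, 1.0), random.gauss(cy, 1.0)))
        labels.append(None)
        truth.append(k)

weights, means, variances = fit_semisupervised_gmm(X, labels, n_classes=2)
unlabeled = [(x, t) for x, lab, t in zip(X, labels, truth) if lab is None]
accuracy = sum(predict(x, weights, means, variances) == t for x, t in unlabeled) / len(unlabeled)
print(f"accuracy on unlabeled samples: {accuracy:.3f}")
```

Clamping the labeled rows is one simple way to let a handful of labels anchor each mixture component while the unlabeled samples refine its mean and variance. A constraint-based variant such as the one the abstract mentions would instead modify the E-step to penalize assignments that violate cannot-link pairs.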