Natural Resources Research (IF 4.8) Pub Date: 2024-07-15, DOI: 10.1007/s11053-024-10379-5 Maximilien Meyrieux, Samer Hmoud, Pim van Geffen, David Kaeter
The ore and waste materials extracted from a mineral deposit during the mining process can have significant variations in their physical and chemical characteristics. The current approaches to geological material characterization are often subjective and usually involve a significant human workload, as there is no optimized, well-defined, and robust methodology to perform this task. This paper proposes a robust, data-driven workflow for geological material characterization. The methodology involves selecting relevant features as a starting point to discriminate between material types. The workflow then employs a robust, state-of-the-art nonlinear dimension reduction (DR) algorithm when the dataset is multidimensional to obtain a two-dimensional embedding. From this two-dimensional embedding, a kernel density estimation (KDE) function is derived. Subsequently, a new clustering algorithm, named ClusterDC, is employed to generate clusters from the KDE function, accurately reflecting geological material types while achieving scalable clustering performance on large drillhole datasets. ClusterDC is a density-based clustering algorithm capable of delineating and ranking high-density zones corresponding to clusters of data samples from a two-dimensional KDE function. The algorithm reduces subjectivity by automatically determining optimal cluster numbers and minimizing reliance on hyperparameters. It also offers hierarchical and flexible clustering, allowing users to group or split clusters, optimally reassign data samples, and identify cluster core points as well as potential outliers. Two case studies were carried out to test the algorithm and demonstrate its application to geochemical drill-core assay data. 
The results of these case studies demonstrate that the application of ClusterDC in the presented workflow supports the characterization of geological material types based on multi-element geochemistry and thus has the potential to help mining companies optimize downstream processes and mitigate technical risks by improving their understanding of their orebodies.
Title: ClusterDC: A New Density-Based Clustering Algorithm and Its Application in a Geological Material Characterization Workflow
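To make the KDE and clustering steps of the workflow more concrete, the following is a minimal Python sketch. It is not the ClusterDC algorithm itself, whose details the abstract does not give. It assumes a two-dimensional embedding is already available (replaced here by synthetic points), estimates a Gaussian KDE on a regular grid, and labels each sample by the density peak it reaches when climbing uphill across grid cells. The function names, the fixed bandwidth of 0.6, the grid extents, and the synthetic data are all illustrative assumptions.

```python
import numpy as np


def kde_on_grid(points, grid_x, grid_y, bandwidth):
    """Evaluate an isotropic Gaussian KDE of 2D points on a regular grid."""
    # The 2D Gaussian kernel factorises into x and y parts, so the density
    # grid is a matrix product of two 1D kernel matrices.
    kx = np.exp(-0.5 * ((grid_x[:, None] - points[None, :, 0]) / bandwidth) ** 2)
    ky = np.exp(-0.5 * ((grid_y[:, None] - points[None, :, 1]) / bandwidth) ** 2)
    norm = len(points) * 2.0 * np.pi * bandwidth ** 2
    return kx @ ky.T / norm  # shape: (len(grid_x), len(grid_y))


def climb_to_peak(density, start):
    """Move to the densest of the 8 neighbouring cells until none is higher."""
    i, j = start
    nx, ny = density.shape
    while True:
        i0, i1 = max(i - 1, 0), min(i + 2, nx)
        j0, j1 = max(j - 1, 0), min(j + 2, ny)
        window = density[i0:i1, j0:j1]
        di, dj = np.unravel_index(np.argmax(window), window.shape)
        ni, nj = i0 + int(di), j0 + int(dj)
        if density[ni, nj] <= density[i, j]:
            return int(i), int(j)  # current cell is a local maximum
        i, j = ni, nj


def density_peak_labels(points, grid_x, grid_y, density):
    """Label each sample by the density peak reached from its nearest grid cell."""
    ix = np.abs(grid_x[:, None] - points[None, :, 0]).argmin(axis=0)
    iy = np.abs(grid_y[:, None] - points[None, :, 1]).argmin(axis=0)
    reached = [climb_to_peak(density, (int(a), int(b))) for a, b in zip(ix, iy)]
    # Rank the distinct peaks from highest to lowest density.
    peaks = sorted(set(reached), key=lambda p: -density[p])
    index = {p: k for k, p in enumerate(peaks)}
    labels = np.array([index[p] for p in reached])
    return labels, peaks


# Synthetic stand-in for a 2D embedding: two well-separated groups (assumption).
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(-3.0, 0.0), scale=0.5, size=(200, 2))
blob_b = rng.normal(loc=(3.0, 0.0), scale=0.5, size=(200, 2))
embedding = np.vstack([blob_a, blob_b])

grid_x = np.linspace(-6.0, 6.0, 121)
grid_y = np.linspace(-4.0, 4.0, 81)
density = kde_on_grid(embedding, grid_x, grid_y, bandwidth=0.6)
labels, peaks = density_peak_labels(embedding, grid_x, grid_y, density)

print("clusters found:", len(peaks))
print("samples per cluster:", np.bincount(labels))
```

In this sketch the number of clusters is not supplied by the user; it falls out of the number of distinct peaks reached, which loosely mirrors the abstract's point about determining cluster numbers automatically. The bandwidth remains a hyperparameter here, and a smaller value would generally produce more peaks.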