Incremental permutation feature importance (iPFI): towards online explanations on data streams
Machine Learning (IF 4.3), Pub Date: 2023-09-20, DOI: 10.1007/s10994-023-06385-y
Fabian Fumagalli, Maximilian Muschalik, Eyke Hüllermeier, Barbara Hammer
Explainable artificial intelligence has so far focused mainly on static learning scenarios. We are interested in dynamic scenarios in which data is sampled progressively and learning is done incrementally rather than in batch mode. We seek efficient incremental algorithms for computing feature importance (FI). Permutation feature importance (PFI) is a well-established, model-agnostic measure of global FI based on marginalizing absent features. We propose iPFI, an efficient, model-agnostic algorithm that estimates this measure incrementally and under dynamic modeling conditions, including concept drift. We prove theoretical guarantees on the approximation quality in terms of expectation and variance. To validate our theoretical findings and the efficacy of our approach in incremental scenarios that deal with streaming data rather than traditional batch settings, we conduct multiple experimental studies on benchmark data with and without concept drift.
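To make the estimation idea concrete, the following is a minimal Python sketch, not the authors' implementation: it estimates PFI on a stream by replacing each feature with a value sampled from recently seen observations (feature marginalization) and folding the resulting loss increase into an exponentially smoothed running mean, so that the scores can adapt under concept drift. The class name IncrementalPFI, the assumed model.predict_one interface, and the parameters alpha and reservoir_size are illustrative assumptions, not part of the paper.

```python
import random
from collections import deque

class IncrementalPFI:
    """Sketch of an incremental permutation feature importance estimator.

    For every incoming (x, y) pair, each feature j is replaced by the value
    observed for that feature in a randomly drawn past sample, and the
    increase in loss is folded into an exponentially smoothed running
    estimate per feature (hypothetical design, for illustration only).
    """

    def __init__(self, n_features, loss_fn, alpha=0.001, reservoir_size=100):
        self.n_features = n_features
        self.loss_fn = loss_fn                          # e.g. squared or 0/1 loss
        self.alpha = alpha                              # smoothing rate (drift adaptation)
        self.reservoir = deque(maxlen=reservoir_size)   # buffer of recent observations
        self.importance = [0.0] * n_features

    def update(self, model, x, y):
        """Fold one observation (x as a list of feature values) into the FI estimate."""
        if self.reservoir:
            base_loss = self.loss_fn(y, model.predict_one(x))
            for j in range(self.n_features):
                x_perturbed = list(x)
                # marginalize feature j: sample its value from a past observation
                x_perturbed[j] = random.choice(self.reservoir)[j]
                perturbed_loss = self.loss_fn(y, model.predict_one(x_perturbed))
                delta = perturbed_loss - base_loss
                # exponentially weighted running mean of the loss increase
                self.importance[j] = (1 - self.alpha) * self.importance[j] + self.alpha * delta
        self.reservoir.append(list(x))
        return list(self.importance)
```

In practice one would interleave this update with the model's own incremental training step on (x, y); how absent features are sampled and how the running estimate is smoothed are exactly the choices whose expectation and variance the paper analyses.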