Evolving interpretable decision trees for reinforcement learning
Artificial Intelligence (IF 14.4), Pub Date: 2023-12-16, DOI: 10.1016/j.artint.2023.104057
Vinícius G. Costa, Jorge Pérez-Aracil, Sancho Salcedo-Sanz, Carlos E. Pedreira

In recent years, reinforcement learning (RL) techniques have achieved great success in many different applications. However, their heavy reliance on complex deep neural networks makes most RL models uninterpretable, limiting their application in domains where trust and security are important. To address this challenge, we propose MENS-DT-RL, an algorithm capable of constructing interpretable models for RL via the evolution of decision tree (DT) models. MENS-DT-RL uses a multi-method ensemble algorithm to evolve univariate DTs, guiding the process with a fitness metric that prioritizes interpretability and consistent high performance. We propose three different initializations for MENS-DT-RL, including the use of Imitation Learning (IL) techniques, as well as a novel pruning approach that reduces solution size without compromising performance. To evaluate the proposed approach, we compare it with other models from the literature on three benchmark tasks from the OpenAI Gym library, as well as on a fertilization problem inspired by real-world crop management. To the best of our knowledge, the proposed scheme is the first to solve the Lunar Lander benchmark with both interpretability and a high confidence rate (90% of episodes are successful), as well as the first to solve the Mountain Car environment with a tree of only 7 nodes. On the real-world task, the proposed MENS-DT-RL is able to produce solutions with the same quality as deep RL policies, with the added benefit of interpretability. We also analyze the best solutions found by the algorithm and show that they are not only interpretable but also diverse in their behavior, empowering the end user with the choice of which model to apply. Overall, the findings show that the proposed approach is capable of producing high-quality transparent models for RL, achieving interpretability without losing performance.
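The abstract gives no implementation details, but the core objects are easy to picture. The sketch below is a minimal, assumed illustration (the class Node and the functions tree_size and fitness are hypothetical names, not the authors' MENS-DT-RL code) of a univariate decision tree acting as an RL policy in an OpenAI Gym environment, with a toy fitness that trades average return against tree size:

import gym
import numpy as np

class Node:
    """Univariate decision-tree node: internal nodes test one observation
    feature against a threshold, leaves store a discrete action."""
    def __init__(self, feature=None, threshold=None, left=None, right=None, action=None):
        self.feature = feature      # observation index tested at internal nodes
        self.threshold = threshold  # split threshold at internal nodes
        self.left = left            # subtree taken when obs[feature] <= threshold
        self.right = right          # subtree taken when obs[feature] > threshold
        self.action = action        # discrete action returned at leaves

    def act(self, obs):
        if self.action is not None:
            return self.action
        if obs[self.feature] <= self.threshold:
            return self.left.act(obs)
        return self.right.act(obs)

def tree_size(node):
    """Node count, used here as a simple interpretability proxy."""
    if node.action is not None:
        return 1
    return 1 + tree_size(node.left) + tree_size(node.right)

def fitness(tree, env_name="MountainCar-v0", episodes=10, size_penalty=0.5):
    """Average episodic return minus a penalty on tree size (illustrative only;
    the paper's actual metric also rewards consistently high performance)."""
    env = gym.make(env_name)
    returns = []
    for _ in range(episodes):
        reset_out = env.reset()
        obs = reset_out[0] if isinstance(reset_out, tuple) else reset_out
        done, total = False, 0.0
        while not done:
            step_out = env.step(tree.act(obs))
            if len(step_out) == 5:  # newer Gym API: terminated/truncated flags
                obs, reward, terminated, truncated, _ = step_out
                done = terminated or truncated
            else:                   # older Gym API: single done flag
                obs, reward, done, _ = step_out
            total += reward
        returns.append(total)
    env.close()
    return float(np.mean(returns)) - size_penalty * tree_size(tree)

# Example: a hand-built 3-node tree for MountainCar-v0, whose observation is
# [position, velocity]: push left (action 0) when velocity is negative,
# push right (action 2) otherwise.
policy = Node(feature=1, threshold=0.0,
              left=Node(action=0),
              right=Node(action=2))
print(tree_size(policy), fitness(policy))

The size penalty above is only a stand-in for the interpretability-aware fitness described in the abstract, which also rewards consistent high performance; the hand-built example simply shows what a small, human-readable policy looks like, in the spirit of the 7-node Mountain Car tree reported by the authors.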



Updated: 2023-12-16