Adaptive safe reinforcement learning‐enabled optimization of battery fast‐charging protocols
AIChE Journal ( IF 3.5 ) Pub Date : 2024-09-25 , DOI: 10.1002/aic.18605 Myisha A. Chowdhury, Saif S.S. Al‐Wahaibi, Qiugang Lu
Optimizing charging protocols is critical for reducing battery charging time and slowing battery degradation in applications such as electric vehicles. Recently, reinforcement learning (RL) methods have been adopted for this purpose. However, RL‐based methods may not guarantee that system (safety) constraints are satisfied, and violations can cause irreversible damage to batteries and shorten their lifetime. To this end, this article proposes an adaptive and safe RL framework that optimizes fast‐charging strategies while respecting safety constraints with high probability. In our method, any unsafe action chosen by the RL agent is projected into a safety region by solving a constrained optimization problem. The safety region is constructed using adaptive Gaussian process (GP) models, consisting of static and dynamic GPs, that learn from online experience to account for changes in battery dynamics. Simulation results show that our method can charge the batteries rapidly while satisfying the constraints under varying operating conditions.
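The projection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulation, so everything here is an assumption made for illustration: the function names, the scalar charging current as the action, the voltage limit, the confidence multiplier `beta`, and the toy linear model standing in for the GP predictions. A simple grid search replaces a proper constrained optimizer.

```python
from typing import Callable, List, Tuple

# Hypothetical predictor type: maps an action (charging current) to the
# predicted mean and standard deviation of a constrained quantity.
Predictor = Callable[[float], Tuple[float, float]]


def project_to_safe_action(
    proposed: float,
    predict: Predictor,
    limit: float,
    beta: float = 2.0,
    lower: float = 0.0,
    upper: float = 6.0,
    n_grid: int = 601,
) -> float:
    """Return the action closest to `proposed` whose upper confidence
    bound (mean + beta * std) stays at or below `limit`."""

    def is_safe(action: float) -> bool:
        mean, std = predict(action)
        return mean + beta * std <= limit

    # Keep the proposed action if it is already inside the safe region.
    clipped = min(max(proposed, lower), upper)
    if is_safe(clipped):
        return clipped

    # Otherwise search a grid of candidates for the nearest safe action.
    step = (upper - lower) / (n_grid - 1)
    candidates: List[float] = [lower + i * step for i in range(n_grid)]
    safe = [a for a in candidates if is_safe(a)]
    if not safe:
        # Assumed fallback: use the smallest allowed current.
        return lower
    return min(safe, key=lambda a: (a - proposed) ** 2)


def toy_voltage_model(current: float) -> Tuple[float, float]:
    """Illustrative stand-in for a GP: predicted voltage rises linearly
    with current, with a constant predictive standard deviation."""
    return 3.6 + 0.1 * current, 0.02


if __name__ == "__main__":
    agent_action = 6.0  # current proposed by the RL agent (hypothetical)
    safe_action = project_to_safe_action(agent_action, toy_voltage_model, limit=4.2)
    print(f"proposed={agent_action:.2f}, applied={safe_action:.2f}")
```

Using the upper confidence bound rather than the predicted mean is one common way to obtain constraint satisfaction "with high probability"; whether the article uses this exact form is not stated in the abstract.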
Updated: 2024-09-25