Modular control architecture for safe marine navigation: Reinforcement learning with predictive safety filters
Artificial Intelligence (IF 5.1), Pub Date: 2024-08-13, DOI: 10.1016/j.artint.2024.104201. Aksel Vaaler, Svein Jostein Husa, Daniel Menges, Thomas Nakken Larsen, Adil Rasheed
Many autonomous systems are safety-critical, making it essential to have a closed-loop control system that satisfies constraints arising from underlying physical limitations and safety aspects in a robust manner. However, this is often challenging to achieve for real-world systems. For example, autonomous ships at sea have nonlinear and uncertain dynamics and are subject to numerous time-varying environmental disturbances such as waves, currents, and wind. There is increasing interest in using machine learning-based approaches to adapt these systems to more complex scenarios, but there are few standard frameworks that guarantee the safety and stability of such systems. Recently, predictive safety filters (PSF) have emerged as a promising method to ensure constraint satisfaction in learning-based control, bypassing the need for explicit constraint handling in the learning algorithms themselves. The safety filter approach leads to a modular separation of the problem, allowing the use of arbitrary control policies in a task-agnostic way. The filter takes in a potentially unsafe control action from the main controller and solves an optimization problem to compute a minimal perturbation of the proposed action that adheres to both physical and safety constraints. In this work, we combine reinforcement learning (RL) with predictive safety filtering in the context of marine navigation and control. The RL agent is trained on path-following and safety adherence across a wide range of randomly generated environments, while the predictive safety filter continuously monitors the agent's proposed control actions and modifies them if necessary. The combined PSF/RL scheme is implemented on a simulated model of Cybership II, a miniature replica of a typical supply ship. Safety performance and learning rate are evaluated and compared with those of a standard RL agent without a PSF. It is demonstrated that the predictive safety filter keeps the vessel safe without hampering the learning rate or performance of the RL agent.
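For context, a predictive safety filter of the kind described in the abstract is commonly posed as the following finite-horizon optimization problem. This is the standard formulation from the PSF literature, not necessarily the exact constraints or vessel model used for Cybership II in the paper:

    \begin{aligned}
    \min_{u_0,\dots,u_{N-1}}\ & \lVert u_0 - u_{\mathrm{RL}} \rVert^2 \\
    \text{s.t.}\ & x_{k+1} = f(x_k, u_k), \quad x_0 = x(t), \\
    & x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \quad k = 0,\dots,N-1, \\
    & x_N \in \mathcal{S}_f,
    \end{aligned}

where u_RL is the action proposed by the RL agent, X and U are the state (safety) and input (physical) constraint sets, and S_f is a terminal safe set ensuring a feasible backup trajectory exists beyond the horizon. Only the first optimized input u_0* is applied; if the proposed action is already safe, the minimizer is u_0* = u_RL and the filter passes it through unchanged.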
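To make the modular PSF/RL interaction concrete, below is a minimal, hypothetical sketch in Python. A linear double-integrator stands in for the vessel dynamics, cvxpy is assumed as the QP solver, and the terminal safe-set constraint is omitted for brevity; the actual implementation for Cybership II uses the 3-DOF nonlinear ship model and the constraints described in the paper.

    import numpy as np
    import cvxpy as cp

    # Toy double-integrator stand-in for the vessel: state = [position, velocity].
    A = np.array([[1.0, 0.1],
                  [0.0, 1.0]])
    B = np.array([[0.005],
                  [0.1]])
    U_MAX = 1.0      # actuator (physical) limit
    X_MAX = 2.0      # safety limit on position, e.g. margin to an obstacle boundary
    HORIZON = 10     # prediction horizon N

    def safety_filter(x0, u_rl):
        """Return the first input of a trajectory that stays feasible over the
        horizon while perturbing the RL agent's proposed action as little as possible."""
        u = cp.Variable((1, HORIZON))
        x = cp.Variable((2, HORIZON + 1))
        constraints = [x[:, 0] == x0]
        for k in range(HORIZON):
            constraints += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k]]
            constraints += [cp.abs(u[:, k]) <= U_MAX]        # input (thrust) limits
            constraints += [cp.abs(x[0, k + 1]) <= X_MAX]    # state (safety) limits
        # A full PSF also imposes a terminal safe-set constraint on x[:, HORIZON];
        # it is left out here to keep the sketch short.
        objective = cp.Minimize(cp.sum_squares(u[:, 0] - u_rl))  # minimal perturbation
        cp.Problem(objective, constraints).solve()
        return u.value[:, 0]

    # One step of the modular control loop: the policy proposes, the filter certifies.
    x = np.array([1.8, 0.5])                 # current state, drifting toward the limit
    u_proposed = np.array([1.0])             # potentially unsafe action from the RL policy
    u_safe = safety_filter(x, u_proposed)    # minimally modified, constraint-satisfying action
    print(u_proposed, u_safe)

In the combined scheme, the RL agent is trained as usual on the path-following and safety reward, and the filter is applied to every proposed action before it reaches the simulated vessel; this is what decouples constraint handling from the learning algorithm itself.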
Updated: 2024-08-13