Robust Q-learning algorithm for Markov decision processes under Wasserstein uncertainty
Automatica (IF 4.8), Pub Date: 2024-08-02, DOI: 10.1016/j.automatica.2024.111825, Ariel Neufeld, Julian Sester
We present a novel Q-learning algorithm tailored to solve distributionally robust Markov decision problems where the corresponding ambiguity set of transition probabilities for the underlying Markov decision process is a Wasserstein ball around a (possibly estimated) reference measure. We prove convergence of the presented algorithm and provide several examples, including ones using real data, to illustrate both the tractability of our algorithm and the benefits of considering distributional robustness when solving stochastic optimal control problems, in particular when the estimated distributions turn out to be misspecified in practice.
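The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of taking a worst case over a Wasserstein ball, not the authors' method. It assumes a small finite state space on the real line, a Wasserstein-1 ball of radius `eps` around a known reference distribution, and a crude grid search over the dual variable `lam` in the standard duality formula for Wasserstein worst-case expectations. All names (`worst_case_expectation`, `robust_q_backup`, `lam_grid`) are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: NOT the algorithm from the paper.
# Assumptions (ours): finite states on the real line, Wasserstein-1 ball,
# known reference distribution, grid search over the dual variable lam.


def worst_case_expectation(values, ref_probs, states, eps, lam_grid):
    """Approximate inf over {P : W_1(P, ref) <= eps} of E_P[values] using the dual
    sup_{lam >= 0} { E_ref[ min_y (values[y] + lam * |x - y|) ] - lam * eps }.
    The supremum is only searched over the finite grid lam_grid."""
    best = float("-inf")
    for lam in lam_grid:
        total = 0.0
        for x, p in zip(states, ref_probs):
            # Penalised inner minimisation: cheapest state to shift mass from x to.
            total += p * min(v + lam * abs(x - y) for y, v in zip(states, values))
        best = max(best, total - lam * eps)
    return best


def robust_q_backup(q, s, a, reward, states, ref_probs, eps, lam_grid,
                    gamma=0.9, lr=0.5):
    """One tabular, model-based update of q[s][a] towards
    reward + gamma * (worst-case expected value of the next state).
    q is a list of per-state action-value lists, indexed by the states 0..n-1."""
    next_values = [max(q[y]) for y in states]
    target = reward + gamma * worst_case_expectation(
        next_values, ref_probs, states, eps, lam_grid)
    q[s][a] += lr * (target - q[s][a])
    return q[s][a]
```

With `eps = 0` and a sufficiently large value in `lam_grid`, the worst case collapses to the ordinary expectation under the reference distribution. With a very large `eps` it collapses to the minimum next-state value, which is the most pessimistic backup possible.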
Updated: 2024-08-02