Regular decision processes
Artificial Intelligence (IF 14.4) Pub Date: 2024-03-21, DOI: 10.1016/j.artint.2024.104113
Ronen I. Brafman, Giuseppe De Giacomo

We introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular, in the language theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs and can be viewed as a POMDP in which the hidden state is a regular function of the entire history. In factored RDPs, transition and reward functions are specified using formulas in linear temporal logics over finite traces, or using regular expressions. This allows specifying complex dependence on the past using intuitive and compact formulas, and building models of partially observable domains without specifying an underlying state space.
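To make the central idea concrete, below is a minimal Python sketch (not from the paper; the toy domain, names, and reward are illustrative assumptions) of a reward that depends on a regular property of the entire history: a small DFA tracks whether an "a" observation has ever been followed by a "b", and the reward is a function of the DFA state rather than of the last observation alone. The paper's factored specifications would express such properties with LTLf formulas or regular expressions instead of a hand-built automaton.

# A minimal sketch of the core RDP idea: a non-Markovian reward that
# depends on a *regular* property of the whole history, tracked by a DFA.
# The domain and all names here are illustrative assumptions.

DFA_START, DFA_SEEN_A, DFA_SEEN_AB = 0, 1, 2

def dfa_step(q: int, obs: str) -> int:
    """Advance the DFA on one observation.

    The DFA accepts exactly the regular language of histories in which
    an 'a' is eventually followed by a 'b'.
    """
    if q == DFA_START and obs == "a":
        return DFA_SEEN_A
    if q == DFA_SEEN_A and obs == "b":
        return DFA_SEEN_AB
    return q  # DFA_SEEN_AB is absorbing; otherwise the state is unchanged

def reward(q: int) -> float:
    """Reward depends only on the DFA state, i.e. on a regular function
    of the entire history -- not on the most recent observation alone."""
    return 1.0 if q == DFA_SEEN_AB else 0.0

# Tracking the DFA state alongside the observable state makes the reward
# Markovian again on the product space -- the sense in which RDPs sit
# between MDPs and POMDPs.
history = ["c", "a", "c", "b", "a"]
q = DFA_START
for obs in history:
    q = dfa_step(q, obs)
    print(obs, q, reward(q))

Running the loop, the reward switches to 1.0 at the fourth step (once "a" has been followed by "b") and stays there, even though the raw observations alone carry no trace of that event.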

Updated: 2024-03-21