Unlearnable Games and “Satisficing” Decisions: A Simple Model for a Complex World
Physical Review X (IF 11.6), Pub Date: 2024-06-06, DOI: 10.1103/physrevx.14.021039
Jérôme Garnier-Brun, Michael Benzaquen, Jean-Philippe Bouchaud

As a schematic model of the complexity economic agents are confronted with, we introduce the “Sherrington-Kirkpatrick game,” a discrete-time binary choice model inspired by mean-field spin glasses. We show that, even in a completely static environment, agents are unable to learn collectively optimal strategies. This is either because the learning process gets trapped in a suboptimal fixed point or because learning never converges and leads to a never-ending evolution of agent intentions. Contrary to the hope that learning might save the standard “rational expectation” framework in economics, we argue that complex situations are generically unlearnable and agents must make do with satisficing solutions, as argued long ago by Simon [Q. J. Econ. 69, 99 (1955)]. Only a centralized, omniscient agent endowed with enormous computing power could qualify to determine the optimal strategy of all agents. Using a mix of analytical arguments and numerical simulations, we find that (i) long memory of past rewards is beneficial to learning, whereas overreaction to the recent past is detrimental and leads to cycles or chaos; (ii) increased competition (nonreciprocity) destabilizes fixed points and leads first to chaos and, in the high-competition limit, to quasicycles; (iii) some amount of randomness in the learning process, perhaps paradoxically, allows the system to reach better collective decisions; (iv) nonstationary, “aging” behavior spontaneously emerges in a large swath of parameter space of our complex but static world. On the positive side, we find that the learning process allows cooperative systems to coordinate around satisficing solutions with rather high (but markedly suboptimal) average reward. However, hypersensitivity to the game parameters makes it impossible to predict ex ante who will be better or worse off in our stylized economy. The statistical description of the space of satisficing solutions is an open problem.
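The abstract only outlines the model, but a minimal sketch helps fix ideas about what a Sherrington-Kirkpatrick-type learning game of this kind could look like in simulation. The sketch below assumes a logit (softmax) choice rule driven by an exponentially discounted memory of past rewards, and Gaussian pairwise payoffs whose symmetry is tuned by a reciprocity parameter. The parameter names (alpha for memory loss, beta for decision noise, eps for reciprocity) and the exact update rule are illustrative assumptions, not the authors' precise specification.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Illustrative parameters (names and values are assumptions, not from the paper) ---
N = 256        # number of agents
alpha = 0.05   # memory-loss rate: small alpha = long memory of past rewards
beta = 2.0     # intensity of choice: large beta = nearly deterministic (greedy) decisions
eps = 0.5      # reciprocity: eps = 1 fully symmetric interactions, eps = -1 fully antisymmetric
T = 5000       # number of learning steps

# Random pairwise payoffs a la Sherrington-Kirkpatrick, mixing a symmetric and an
# antisymmetric part so that eps tunes how reciprocal (cooperative) the game is.
A = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
J = (1 + eps) * (A + A.T) / 2 + (1 - eps) * (A - A.T) / 2
np.fill_diagonal(J, 0.0)

S = rng.choice([-1.0, 1.0], size=N)   # binary decisions S_i = +/-1
Q = np.zeros(N)                        # discounted "attraction" toward the +1 choice

avg_reward = []
for t in range(T):
    local_field = J @ S                              # reward difference between the two choices
    Q = (1 - alpha) * Q + alpha * local_field        # exponentially discounted memory of rewards
    p_plus = 1.0 / (1.0 + np.exp(-2.0 * beta * Q))   # logit (noisy) choice rule
    S = np.where(rng.random(N) < p_plus, 1.0, -1.0)
    avg_reward.append(np.mean(S * (J @ S)))          # realized average reward per agent

print(f"average reward over last 10% of steps: {np.mean(avg_reward[-T // 10:]):.3f}")
```

Varying alpha, beta, and eps in such a sketch lets one probe, at a qualitative level, the regimes discussed in points (i)-(iii) above (long versus short memory, noisy versus greedy decisions, cooperative versus competitive interactions), though it makes no claim to reproduce the paper's quantitative results.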

Updated: 2024-06-06