Effect of Q-learning on the evolution of cooperation behavior in collective motion: An improved Vicsek model
Applied Mathematics and Computation (IF 3.5), Pub Date: 2024-07-30, DOI: 10.1016/j.amc.2024.128956
Chengjie Wang, Juan Deng, Hui Zhao, Li Li

Collective behavior has been studied extensively, and communication between agents strongly affects both the payoff and the cost of decision-making. Research usually focuses on how to improve the collective synchronization rate or accelerate the emergence of cooperation under a given communication cost constraint. In this context, evolutionary game theory (EGT) and reinforcement learning (RL) are essential frameworks for tackling this problem. This study introduces an adapted Vicsek model in which agents move differently depending on their chosen strategies. Each agent gains a payoff determined by the benefit of collective motion weighed against the cost of communicating with neighboring agents. Individuals choose their target agents using Q-learning and then update their strategies according to the Fermi rule. The results show that, with Q-learning, cooperation and synchronization peak at an optimal communication radius. The analysis of random noise and relative cost leads to similar conclusions. Several cost functions were considered to demonstrate the robustness of the proposed model and its conclusions under a wide range of conditions.
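
The two update mechanisms named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook forms of the Fermi imitation rule and the tabular Q-learning update. The abstract does not give the paper's state and action definitions, payoff function, or parameter values, so every name and default below (`kappa`, `alpha`, `gamma`, `epsilon`, the table layout) is an assumption made for illustration, not the authors' implementation.

```python
import math
import random


def fermi_probability(payoff_self, payoff_target, kappa=0.1):
    """Probability that an agent adopts the target agent's strategy.

    Standard Fermi rule: W = 1 / (1 + exp((P_self - P_target) / kappa)).
    kappa (selection noise) is an assumed value, not taken from the paper.
    """
    exponent = (payoff_self - payoff_target) / kappa
    # Clamp the exponent so math.exp cannot overflow for large payoff gaps.
    exponent = max(-700.0, min(700.0, exponent))
    return 1.0 / (1.0 + math.exp(exponent))


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on a list-of-lists table.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a)).
    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


def epsilon_greedy(q_row, epsilon=0.05, rng=random):
    """Pick an action index: random with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)
```

In a simulation loop of this kind, each agent would typically use `epsilon_greedy` to select which neighbor to compare against, call `q_update` with the payoff it then receives, and switch to that neighbor's strategy with probability `fermi_probability`. How the paper couples these steps to the Vicsek heading update is not stated in the abstract.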


Updated: 2024-07-30
