Nature Machine Intelligence (IF 18.8), Pub Date: 2024-09-03, DOI: 10.1038/s42256-024-00879-7
Chengdong Ma, Aming Li, Yali Du, Hao Dong, Yaodong Yang
The primary challenge in the development of large-scale artificial intelligence (AI) systems lies in achieving scalable decision-making—extending the AI models while maintaining sufficient performance. Existing research indicates that distributed AI can improve scalability by decomposing complex tasks and distributing them across collaborative nodes. However, previous technologies suffered from compromised real-world applicability and scalability due to the massive requirement of communication and sampled data. Here we develop a model-based decentralized policy optimization framework, which can be efficiently deployed in multi-agent systems. By leveraging local observation through the agent-level topological decoupling of global dynamics, we prove that this decentralized mechanism achieves accurate estimations of global information. Importantly, we further introduce model learning to reinforce the optimal policy for monotonic improvement with a limited amount of sampled data. Empirical results on diverse scenarios show the superior scalability of our approach, particularly in real-world systems with hundreds of agents, thereby paving the way for scaling up AI systems.
Title: Efficient and scalable reinforcement learning for large-scale network control