Hierarchical federated deep reinforcement learning based joint communication and computation for UAV situation awareness, Vehicular Communications



Vehicular Communications (IF 5.8), Pub Date: 2024-11-06, DOI: 10.1016/j.vehcom.2024.100853
Haitao Li, Jiawei Huang

The computation-intensive situational awareness (SA) task of an unmanned aerial vehicle (UAV) is severely constrained by the UAV's limited power and computing capability. To address this challenge, we consider the joint communication and computation (JCC) design for UAV networks. First, we build a multi-objective optimization (MOO) model that simultaneously optimizes UAV computation offloading, transmit power, and local computation resources to minimize energy consumption and task execution delay. Then, we develop a Thompson sampling based double-DQN (TDDQN) learning algorithm, which allows the agent to explore more deeply and effectively, and propose a joint optimization algorithm that combines TDDQN with sequential least squares quadratic programming (SLSQP) to handle the MOO problem. Finally, to improve training speed and quality, we incorporate federated learning (FL) into the joint optimization algorithm and propose hierarchical federated TDDQN with SLSQP (HF TDDQN-S) to implement the JCC design. Simulation results show that HF TDDQN-S efficiently learns the best JCC strategy, achieves a lower average cost than the DDQN with SLSQP (DDQN-S) and TDDQN with SLSQP (TDDQN-S) approaches, and delivers power-efficient SA with low average delay.
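The abstract does not say how the hierarchical federated step aggregates the agents' models. As a rough illustration only, the sketch below assumes a two-tier, FedAvg-style scheme: each cluster of UAVs averages its members' parameters, and the cluster models are then averaged into a global model. The function names, the flat list-of-floats weight format, and the weighting by sample count are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation: a model is a flat list of parameters.
Weights = List[float]


def weighted_average(models: Sequence[Weights], sizes: Sequence[int]) -> Weights:
    """FedAvg-style average of parameter vectors, weighted by sample counts."""
    if not models or len(models) != len(sizes):
        raise ValueError("need one positive sample count per model")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(models[0])
    return [
        sum(m[i] * n for m, n in zip(models, sizes)) / total
        for i in range(dim)
    ]


def hierarchical_average(clusters: Sequence[Sequence[Tuple[Weights, int]]]) -> Weights:
    """Two-tier aggregation (assumed scheme, not confirmed by the paper).

    Tier 1: average the (weights, n_samples) pairs inside each cluster.
    Tier 2: average the cluster models, weighted by cluster sample totals.
    """
    cluster_models: List[Weights] = []
    cluster_sizes: List[int] = []
    for cluster in clusters:
        models = [w for w, _ in cluster]
        sizes = [n for _, n in cluster]
        cluster_models.append(weighted_average(models, sizes))
        cluster_sizes.append(sum(sizes))
    return weighted_average(cluster_models, cluster_sizes)


# Example: two clusters, three UAV agents in total.
clusters = [
    [([1.0, 2.0], 1), ([3.0, 4.0], 3)],  # cluster A
    [([5.0, 6.0], 4)],                   # cluster B
]
global_model = hierarchical_average(clusters)
```

With sample-count weighting at both tiers, the result equals a single flat weighted average over all agents. The hierarchy changes where the communication happens, not the final average; a real system might also aggregate the two tiers at different intervals.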


Updated: 2024-11-06
