Offloading in V2X with road side units: Deep reinforcement learning



Vehicular Communications (IF 5.8), Pub Date: 2024-12-05, DOI: 10.1016/j.vehcom.2024.100862
Widhi Yahya, Ying-Dar Lin, Faysal Marzuk, Piotr Chołda, Yuan-Cheng Lai

Traffic offloading is crucial for reducing computing latency in distributed edge systems such as vehicle-to-everything (V2X) networks, which use roadside units (RSUs) and access network mobile edge computing (AN-MEC) with ML agents. Offloading is a control plane problem, and in complex V2X systems it requires fast decision-making. This study presents a novel ratio-based offloading strategy that uses the twin delayed deep deterministic policy gradient (TD3) algorithm to optimize offloading ratios in a two-tier V2X system, enabling computation at both the RSUs and the edge. The optimization covers both vertical and horizontal offloading, which yields a continuous search space, and decisions must be fast enough to accommodate fluctuating traffic. We developed a V2X environment to evaluate the offloading agent, incorporating latency models, state and action definitions, and reward structures. We compare TD3 with the metaheuristic simulated annealing (SA) and examine the impact of single versus multiple offloading agents, with deployment options at a centralized central office (CO). Evaluation results indicate that TD3's decision time is roughly five orders of magnitude shorter than SA's. For 10 and 50 sites, SA takes 602 and 20,421 seconds, respectively, while single-agent TD3 requires 4 to 24 milliseconds and multi-agent TD3 takes 1 to 3 milliseconds. The average latency ranges from 0.18 to 0.32 milliseconds for SA, from 0.26 to 0.5 milliseconds for single-agent TD3, and from 0.22 to 0.45 milliseconds for multi-agent TD3, indicating that, after initial training, TD3 approximates the solution quality of SA.
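The decision-time comparison can be checked with a few lines of arithmetic. The sketch below is not from the paper; it only reuses the figures quoted in the abstract. It assumes that the lower bound of each TD3 range corresponds to 10 sites and the upper bound to 50 sites, a pairing the abstract does not state explicitly. The names `speedup_table` and `orders_of_magnitude` are hypothetical helpers introduced for illustration.

```python
import math

# Decision times quoted in the abstract, converted to seconds.
# Assumption: the lower bound of each TD3 range belongs to 10 sites and
# the upper bound to 50 sites; the abstract does not state this pairing.
SA_SECONDS = {10: 602.0, 50: 20421.0}
TD3_SINGLE_SECONDS = {10: 4e-3, 50: 24e-3}
TD3_MULTI_SECONDS = {10: 1e-3, 50: 3e-3}


def orders_of_magnitude(slow: float, fast: float) -> float:
    """Return log10 of the speedup factor slow / fast."""
    return math.log10(slow / fast)


def speedup_table() -> dict:
    """Map (agent, sites) to the log10 speedup of TD3 over SA."""
    table = {}
    for sites, sa_time in SA_SECONDS.items():
        table[("single", sites)] = orders_of_magnitude(
            sa_time, TD3_SINGLE_SECONDS[sites]
        )
        table[("multi", sites)] = orders_of_magnitude(
            sa_time, TD3_MULTI_SECONDS[sites]
        )
    return table


if __name__ == "__main__":
    for (agent, sites), orders in sorted(speedup_table().items()):
        print(f"{agent}-agent TD3, {sites} sites: about 10^{orders:.1f} times faster than SA")
```

Under the assumed pairing, every combination gives a speedup of at least five orders of magnitude, and the multi-agent case at 50 sites approaches seven, so the "five orders" figure reads as a lower bound.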


Updated: 2024-12-05
