An adaptive meta-imitation learning-based recommendation environment simulator: A case study on ship-cargo matching


Information Fusion (IF 14.7), Pub Date: 2024-10-18, DOI: 10.1016/j.inffus.2024.102740
Guangyao Pang, Jiehang Xie, Fei Hao

High-quality shipping is one of the effective ways for sustainable cities in inland river basins to improve transportation efficiency and reduce energy consumption. Currently, the biggest challenge faced by shipping is the high empty-ship rate, which makes it impossible to directly apply machine learning methods due to the cold-start problem. Although some researchers have tried to use deep reinforcement learning (DRL)-based recommendation methods, which do not rely on manually labeled data, to alleviate the cold-start problem, progress has been slow due to the lack of an available training environment. Therefore, this paper introduces an adaptive meta-imitation learning-based recommendation environment simulator, termed AMIL-Simulator. Specifically, we construct a conditionally guided diffusion model to simulate shipowner behavior in a dynamically changing environment. Moreover, we propose a shipowner reward model based on adaptive meta-imitation learning, enabling the learning of shipowner rewards across multiple tasks, even when confronted with limited samples and imbalanced categories. Extensive quantitative experimental evaluations and shipowner-cargo matching studies demonstrate the effectiveness of AMIL-Simulator, particularly in smaller-scale and cold-start environments.


Updated: 2024-10-18
