


Contextual bandits learning-based branch-and-price-and-cut algorithm for the two-dimensional vector packing problem with conflicts and time windows
Transportation Research Part E: Logistics and Transportation Review (IF 8.3), Pub Date: 2024-11-29, DOI: 10.1016/j.tre.2024.103866
Yanru Chen, Mujin Gao, Zongcheng Zhang, Junheng Li, M.I.M. Wahab, Yangsheng Jiang

This study investigates a two-dimensional vector packing problem with conflicts and time windows (2DVPPCTW). The problem consists of packing items into the minimum number of bins, where the items are characterized by different weights, volumes, and time windows. Some items conflict with one another, and conflicting items cannot be packed in the same bin. We formulate the 2DVPPCTW as an integer programming model and reformulate it into a master problem and a subproblem based on the Dantzig–Wolfe decomposition. An exact algorithm that incorporates a reinforcement learning technique, the contextual bandits learning-based branch-and-price-and-cut algorithm (CBL-BPC), is proposed for the 2DVPPCTW. In particular, we provide a CBL framework for the subproblem, which usually poses considerable computational challenges. Five heuristic algorithms are developed as bandit arms in the CBL framework: adaptive large neighborhood search (ALNS), an ant colony optimization heuristic (ACO), heuristic dynamic programming (DP), a combination of ALNS and heuristic DP, and a combination of ACO and heuristic DP. The CBL framework adaptively chooses one of the five heuristic algorithms to solve the subproblem by learning from previous experience. An exact dynamic programming algorithm is invoked to guarantee optimality once the CBL fails to find a better solution to the subproblem. Rounded capacity inequalities and accelerating strategies are introduced to speed up the solution process. An extensive computational study shows that the CBL-BPC can solve all 800 instances optimally within a reasonable time frame and is highly competitive with state-of-the-art exact and heuristic methods.
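The arm-selection idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which contextual bandit policy, context features, or reward signal the authors use, so everything below is an assumption made for illustration: a tabular epsilon-greedy policy, a context given as a hashable label, a reward of 1 when the chosen heuristic returns at least one column, and the hypothetical names `EpsilonGreedyContextualBandit` and `solve_pricing`. Only the five arm labels and the fallback to an exact dynamic programming solver come from the abstract.

```python
import random
from collections import defaultdict

# Arm labels taken from the abstract; everything else here is hypothetical.
ARMS = ["ALNS", "ACO", "heuristic_DP", "ALNS+heuristic_DP", "ACO+heuristic_DP"]


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy bandit: one running-mean reward per (context, arm).

    Assumption: the paper's actual CBL policy is not described in the abstract;
    epsilon-greedy is swapped in as the simplest contextual bandit to show.
    """

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # number of pulls per (context, arm)
        self.values = defaultdict(float)  # mean observed reward per (context, arm)

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        # Highest estimated reward wins; ties go to the earliest arm in the list.
        return max(self.arms, key=lambda arm: self.values[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental running-mean update of the reward estimate.
        key = (context, arm)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def solve_pricing(bandit, context, heuristics, exact_solver):
    """Run the heuristic picked by the bandit; fall back to the exact solver.

    Assumption: each heuristic returns a (possibly empty) list of improving
    columns, and the reward is 1.0 if the list is non-empty, else 0.0.
    """
    arm = bandit.select(context)
    columns = heuristics[arm](context)
    bandit.update(context, arm, 1.0 if columns else 0.0)
    if columns:
        return arm, columns
    # No heuristic column found: the exact solver decides whether one exists.
    return "exact_DP", exact_solver(context)
```

In a column generation loop, `solve_pricing` would be called once per pricing round, with `context` summarizing the current state (for example, a bucketed instance size or iteration count). The exact fallback is what keeps the overall method exact: the heuristics only speed up the search for columns, while the exact solver certifies that none remain.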


Updated: 2024-11-29
