A deep reinforcement learning approach for online and concurrent 3D bin packing optimisation with bin replacement strategies
Computers in Industry (IF 8.2), Pub Date: 2024-11-04, DOI: 10.1016/j.compind.2024.104202
Y.P. Tsang, D.Y. Mo, K.T. Chung, C.K.M. Lee

In robotic palletisation, maximising space utilisation is vital but difficult, because decisions are complex and must be made in real time without complete prior information about incoming items. The widely adopted rule-based heuristics are easy to use but fail to adapt dynamically to the complex and changing conditions of online 3D bin packing. This study is motivated by the need for a more agile and intelligent system that can handle dual-bin scenarios and a variable inflow of items. It introduces a novel deep reinforcement learning (DRL) optimiser that employs a double deep Q-network (DDQN) to learn packing policies in an online environment, together with two proposed bin replacement strategies. The approach overcomes limitations of previous methods by managing multiple bins concurrently and by adjusting decisions on the fly with only limited prior knowledge. In a case study involving a logistics company, the proposed optimiser significantly improved average space utilisation across various lookahead scenarios and outperformed traditional heuristics in simulation experiments. The proposed optimiser contributes to the economic and environmental sustainability of robotic warehouses and is positioned as a building block for future smart logistics.
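For readers unfamiliar with the DDQN named in the abstract, the following is a minimal sketch of the standard Double DQN target computation, in which the online network selects the next action and the target network evaluates it. It is not the authors' implementation: the abstract does not specify the state encoding, action space, reward, or bin replacement logic, so the function name `double_dqn_targets`, its parameters, and the discount factor `gamma` are all illustrative assumptions.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute standard Double DQN targets for a batch of transitions.

    All names here are hypothetical; the paper's abstract gives no such details.
    q_online_next: (batch, n_actions) next-state Q-values from the online network.
    q_target_next: (batch, n_actions) next-state Q-values from the target network.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # The online network chooses the greedy next action ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... and the target network supplies the value of that action.
    next_values = q_target_next[np.arange(len(best_actions)), best_actions]

    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values

if __name__ == "__main__":
    targets = double_dqn_targets(
        rewards=[1.0, 0.0],
        dones=[0, 1],
        q_online_next=[[1.0, 2.0], [3.0, 0.0]],
        q_target_next=[[10.0, 5.0], [7.0, 8.0]],
        gamma=0.9,
    )
    print(targets)
```

Separating action selection from action evaluation is what distinguishes Double DQN from plain DQN and is intended to reduce overestimation of Q-values. In a packing setting, an action might correspond to a choice of bin and placement, but that mapping is an assumption here.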

Updated: 2024-11-04