Integrated reinforcement and imitation learning for tower crane lift path planning
Automation in Construction (IF 9.6), Pub Date: 2024-06-22, DOI: 10.1016/j.autcon.2024.105568
Zikang Wang , Chun Huang , Boqiang Yao , Xin Li

Reinforcement learning (RL) has emerged as a promising solution method for crane-lift path planning. However, designing appropriate reward functions for tower crane (TC) operations remains particularly challenging. Poor design of reward functions can lead to non-executable lifting paths. This paper presents a framework combining imitation learning (IL) and RL to address the challenge. The framework comprises three steps: (1) designing a virtual environment consisting of construction site models and a TC model, (2) collecting expert demonstrations through virtual reality (VR) and pretraining through behavioral cloning (BC), and (3) refining the BC policies via generative adversarial imitation learning (GAIL) and proximal policy optimization (PPO). Using the paths generated by a PPO model as the baseline, the proposed BC + PPO + GAIL model exhibited better performance in both blind and nonblind lifting scenarios. This framework has been proven to generate realistic lifting paths mirroring crane operator behavior while ensuring efficiency and safety.
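
As an illustration of step (3), the sketch below shows one common way a GAIL discriminator signal can be blended with an environment reward before a PPO update. It is not taken from the paper: the function names, the `-log(1 - D)` reward form, and the `gail_weight` parameter are assumptions chosen for illustration, and the abstract does not state how the authors weight the two signals.

```python
import math


def gail_reward(d_prob: float, eps: float = 1e-8) -> float:
    """Imitation reward from a discriminator output (hypothetical helper).

    d_prob is the discriminator's estimated probability that a
    (state, action) pair came from an expert demonstration.
    Uses the -log(1 - D) convention; eps keeps the log finite at D = 1.
    """
    return -math.log(max(1.0 - d_prob, eps))


def blended_reward(env_reward: float, d_prob: float, gail_weight: float = 0.5) -> float:
    """Mix the hand-designed environment reward with the imitation reward.

    gail_weight is an assumed tuning knob in [0, 1]:
    0 means pure environment reward, 1 means pure imitation reward.
    """
    if not 0.0 <= gail_weight <= 1.0:
        raise ValueError("gail_weight must lie in [0, 1]")
    return (1.0 - gail_weight) * env_reward + gail_weight * gail_reward(d_prob)


if __name__ == "__main__":
    # An action the discriminator judges expert-like earns a larger bonus.
    print(blended_reward(env_reward=1.0, d_prob=0.9))
    print(blended_reward(env_reward=1.0, d_prob=0.1))
```

With a blend of this kind, the policy optimizer would see a reward that rises when the discriminator finds a step expert-like, which is one way demonstrations can compensate for a reward function that is hard to hand-design.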

Updated: 2024-06-22