A parallel graph network for generating 7-DoF model-free grasps in unstructured scenes using point cloud
Robotics and Computer-Integrated Manufacturing (IF 9.1) | Pub Date: 2024-09-17 | DOI: 10.1016/j.rcim.2024.102879 | Chungang Zhuang, Haowen Wang, Wanhao Niu, Han Ding
Generating model-free grasps in complex, cluttered scenes remains a challenging task. Most current methods adopt PointNet++ as the backbone to extract structural features, while the relative geometric associations among points remain underexplored, leading to suboptimal grasp predictions. In this work, a parallelized graph-based pipeline is developed to solve the 7-DoF grasp pose generation problem with a point cloud as input. Using the non-textured information of the grasping scene, the proposed pipeline performs feature embedding and grasping-location focusing simultaneously in two branches, avoiding mutual interference between the two learning processes. In the feature-learning branch, the geometric features of the whole scene are fully learned. In the location-focusing branch, high-value grasping locations on object surfaces are strategically selected. Using the graph features learned at these locations, the pipeline then outputs refined grasping directions and widths in conjunction with local spatial features. To strengthen positional features in the grasping problem, a graph convolution operator based on a positional attention mechanism is designed, and a graph residual network built on this operator is applied in both branches. This pipeline abstracts the grasp-location selection task out of the main grasp-generation process, which lowers the learning difficulty while avoiding the performance-degradation problem of deep graph networks. The established pipeline is evaluated on the GraspNet-1Billion dataset, demonstrating much better performance and stronger generalization capability than the benchmark approach. In robotic bin-picking experiments, the proposed method effectively understands cluttered grasping scenes and grasps multiple types of unknown objects with a high success rate.
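The abstract's key ingredient is a graph convolution operator whose aggregation weights come from a positional attention mechanism over relative point coordinates. The paper's exact formulation is not given here, so the following is only a minimal NumPy sketch of the general idea: each point attends to its k nearest neighbors with weights derived from relative positions, then aggregates their transformed features. The function name, weight shapes, and the single linear projection `w_pos` are illustrative assumptions, not the authors' design.

```python
import numpy as np

def positional_attention_graph_conv(xyz, feats, neighbors, w_pos, w_feat):
    """Illustrative positional-attention graph convolution (not the paper's exact operator).

    xyz:       (N, 3) point coordinates
    feats:     (N, d_in) per-point features
    neighbors: (N, k) indices of each point's k nearest neighbors
    w_pos:     (3,) hypothetical projection mapping relative positions to attention logits
    w_feat:    (d_in, d_out) feature transform
    """
    N, k = neighbors.shape
    out = np.zeros((N, w_feat.shape[1]))
    for i in range(N):
        idx = neighbors[i]
        rel = xyz[idx] - xyz[i]                        # (k, 3) relative positions
        logits = rel @ w_pos                           # positional attention logits
        logits = logits - logits.max()                 # numerical stability
        attn = np.exp(logits) / np.exp(logits).sum()   # softmax over neighbors
        out[i] = attn @ (feats[idx] @ w_feat)          # attention-weighted aggregation
    return out

# Toy usage: 6 points, each attending to its 3 nearest neighbors
rng = np.random.default_rng(0)
xyz = rng.normal(size=(6, 3))
feats = rng.normal(size=(6, 4))
dists = ((xyz[:, None] - xyz[None]) ** 2).sum(-1)
neighbors = np.argsort(dists, axis=1)[:, 1:4]          # drop self (column 0)
out = positional_attention_graph_conv(
    xyz, feats, neighbors, rng.normal(size=3), rng.normal(size=(4, 2)))
print(out.shape)  # (6, 2)
```

In the paper this kind of operator is stacked inside a graph residual network and shared by both branches; the residual connections are what let the graph network go deep without the degradation the abstract mentions.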
Updated: 2024-09-17