Details
Document type: Journal article
Title (Chinese): 用于6D姿态估计的轻量级全流双向融合网络
Title (English): Lightweight Full-Flow Bidirectional Fusion Network for 6D Pose Estimation
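Authors: Lin Haotian (林浩田)[1]; Li Yongchang (李永昌)[1]; Jiang Jing (江静)[1]; Qin Guangjun (秦广军)[1]
First author: Lin Haotian (林浩田)
Affiliation: [1] Smart City College, Beijing Union University, Beijing 100101
First affiliation: Smart City College, Beijing Union University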
Year: 2024
Volume: 60
Issue: 22
Pages: 282-291
Journal (Chinese): 计算机工程与应用
Journal (English): Computer Engineering and Applications
Indexed in: CSTPCD; Peking University Core Journals (北大核心, 2023 edition); CSCD (CSCD_E2023_2024)
Funding: Beijing Union University research project (ZKZD202301).
Language: Chinese
Keywords: RGBD; pose estimation; lightweight; FasterNet; PoolFormer
Abstract: Six-degrees-of-freedom (6D) pose estimation is a key step in applications such as robotic grasping and manipulation, augmented reality, and autonomous driving. Conventional 6D pose estimation methods focus mainly on designing complex networks to improve estimation performance, while overlooking the practical deployment difficulties caused by high model complexity and large parameter counts. Taking FFB6D as the baseline, this paper designs a lightweight full-flow bidirectional fusion network (LFFB6D), a lightweight RGBD-based 6D pose estimation method. The method consists of two parallel encoder-decoder networks: a convolutional neural network (CNN) and a point cloud network (PCN). In the CNN branch, FasterNet is introduced to replace 3×3 convolutions. With the CNN encoder replaced, an upsampling module, FUPB (faster upsample block), is proposed to reduce the number of network parameters. In the PCN branch, PoolFormer is introduced to process and aggregate point cloud features, and a new pooling module, PFPB (PoolFormer pooling block), is proposed to improve network performance. Experiments show that LFFB6D has 46% fewer parameters than FFB6D. Using only 1/13 of the LineMOD training set and 1/9 of the YCB-Video training set, LFFB6D outperforms methods such as PoseCNN and DenseFusion and achieves results close to those of PVN3D and FFB6D.
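As a rough illustration of why replacing 3×3 convolutions with FasterNet-style blocks can shrink a network, the sketch below counts convolution weights for a single layer. It is not the authors' implementation and does not reproduce the reported 46% figure, which refers to the whole network. The partial ratio of 1/4 and the expansion factor of 2 are assumptions based on the commonly described FasterNet design, and all function names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution from c_in to c_out channels (no bias)."""
    return k * k * c_in * c_out


def pconv_params(channels, partial_ratio=4):
    """Weights of a partial 3x3 convolution.

    Assumption: the 3x3 kernel is applied to only channels // partial_ratio
    channels, and the remaining channels pass through unchanged.
    """
    c_p = channels // partial_ratio
    return conv_params(c_p, c_p, 3)


def fasternet_style_block_params(channels, partial_ratio=4, expansion=2):
    """Weights of one block: partial 3x3 conv followed by two 1x1 convs.

    Assumption: the 1x1 convs expand to channels * expansion and project back.
    Normalization layers and biases are ignored for simplicity.
    """
    hidden = channels * expansion
    return (
        pconv_params(channels, partial_ratio)
        + conv_params(channels, hidden, 1)
        + conv_params(hidden, channels, 1)
    )


if __name__ == "__main__":
    c = 64
    print("standard 3x3 conv:", conv_params(c, c, 3))
    print("partial 3x3 conv: ", pconv_params(c))
    print("FasterNet-style block:", fasternet_style_block_params(c))
```

Under these assumptions, the partial convolution alone needs 1/16 of the weights of a full 3×3 convolution at the same width, and most of the block's remaining cost sits in the two 1×1 convolutions.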