
Details

融合多尺度和注意力机制的小样本目标检测    

Few-shot object detection via fusing multi-scale and attention mechanism

Document type: Journal article

Chinese title: 融合多尺度和注意力机制的小样本目标检测

English title: Few-shot object detection via fusing multi-scale and attention mechanism

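Authors: Li Hongtian (李鸿天) [1]; Shi Xinhao (史鑫昊) [1]; Pan Weiguo (潘卫国) [1]; Xu Cheng (徐成) [1]; Xu Bingxin (徐冰心) [1]; Yuan Jiazheng (袁家政) [1,2]

First author: Li Hongtian (李鸿天)

Affiliations: [1] Beijing Key Laboratory of Information Service Engineering (Beijing Union University), Beijing 100101; [2] College of Science and Technology, Beijing Open University, Beijing 100081

First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University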

Year: 2024

Volume: 44

Issue: 5

Pages: 1437-1444

Chinese journal name: 计算机应用

Foreign-language journal name: Journal of Computer Applications

Indexed in: CSTPCD; Peking University Core Journals (北大核心, 2023); CSCD (CSCD_E2023_2024)

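Funding: Beijing Natural Science Foundation (4232026); National Natural Science Foundation of China (62171042, 62272049, 61932012, 61871039, 62102033, 62006020); Beijing Key Science and Technology Project (KZ202211417048); High-Level Research and Innovation Team Project of Beijing Municipal Universities (BPHR20220120); Beijing Chaoyang District Collaborative Innovation Center Project (CYX2203); Beijing Union University Research Projects (ZK10202202, BPHR2020DZ02, ZK40202101, ZK120202104).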

Language: Chinese

Chinese keywords: 迁移学习; 小样本目标检测; 注意力机制; 多尺度特征融合; 余弦相似度

Foreign-language keywords: transfer learning; few-shot object detection; attention mechanism; multi-scale feature fusion; cosine similarity

Abstract: Existing two-stage few-shot object detection methods based on fine-tuning are insensitive to the features of novel classes and tend to misclassify a novel class as a base class that closely resembles it, which degrades detection performance. To address this problem, a few-shot object detection algorithm that fuses multi-scale features and an attention mechanism, MA-FSOD (Few-Shot Object Detection via fusing Multi-scale and Attention mechanism), is proposed. First, grouped convolutions and large convolution kernels are used in the backbone network to extract more class-discriminative features, and a Convolutional Block Attention Module (CBAM) is added for adaptive feature enhancement. Next, a modified pyramid network performs multi-scale feature fusion, so that the Region Proposal Network (RPN) can accurately locate Regions of Interest (RoIs) and supply the classification head with richer, high-quality positive samples at multiple scales. Finally, a cosine classification head is used in the fine-tuning stage to reduce intra-class variance. On the PASCAL-VOC 2007/2012 datasets, MA-FSOD improves AP50 on novel classes by 5.6 percentage points over the Few-Shot object detection via Contrastive proposal Encoding (FSCE) algorithm; on the more challenging MSCOCO dataset, it improves the 10-shot and 30-shot AP by 0.1 and 1.6 percentage points, respectively, over Meta-Faster-RCNN. Experimental results show that, compared with some mainstream few-shot object detection algorithms, MA-FSOD alleviates the misclassification problem more effectively and achieves higher accuracy in few-shot object detection.
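
The abstract names a cosine classification head but does not give its formula. The following is a minimal sketch of the general idea only, assuming the common formulation in which the RoI feature and each class weight vector are L2-normalized before their dot product is taken. The function names, the scale factor of 20, and the toy vectors are illustrative assumptions and are not taken from the paper.

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def cosine_logits(feature, class_weights, scale=20.0):
    """Return one scaled cosine-similarity logit per class.

    feature:       RoI feature vector (list of floats)
    class_weights: one weight vector per class
    scale:         hypothetical scaling factor, not a value from the paper
    """
    f_hat = l2_normalize(feature)
    logits = []
    for w in class_weights:
        w_hat = l2_normalize(w)
        logits.append(scale * sum(a * b for a, b in zip(f_hat, w_hat)))
    return logits


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: two classes, and two features that point in the same
# direction but differ tenfold in magnitude.
weights = [[1.0, 0.0], [0.0, 1.0]]
small = cosine_logits([0.9, 0.1], weights)
large = cosine_logits([9.0, 1.0], weights)
probs = softmax(small)
```

Because normalization discards magnitude, `small` and `large` hold the same logits up to floating-point error. Only the direction of the feature affects the class scores, which is the intuition usually given for why such a head can reduce intra-class variance.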
