Details
BFF R-CNN: Balanced Feature Fusion for Object Detection (indexed in SCI-EXPANDED and EI)
Document type: Journal article
English title: BFF R-CNN: Balanced Feature Fusion for Object Detection
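Authors: Liu, Hongzhe[1]; Wang, Ningwei[1]; Li, Xuewei[1]; Xu, Cheng[1]; Li, Yaze[1]
First author: Liu, Hongzhe (刘宏哲)
Corresponding author: Li, XW[1]
Affiliation: [1] Beijing Union Univ, Inst Brain & Cognit Sci, Coll Robot, Beijing Key Lab Informat Serv Engn, 97 Beisihuan East Rd, Beijing 100101, Peoples R China
First affiliation: College of Robotics, Beijing Union University | Beijing Key Laboratory of Information Service Engineering, Beijing Union University
Corresponding affiliation: [1] (corresponding author), Beijing Union Univ, Inst Brain & Cognit Sci, Coll Robot, Beijing Key Lab Informat Serv Engn, 97 Beisihuan East Rd, Beijing 100101, Peoples R China. | [1141739] College of Robotics, Beijing Union University; [11417] Beijing Union University; [11417103] Beijing Key Laboratory of Information Service Engineering, Beijing Union University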
Year: 2022
Volume: E105D
Issue: 8
Pages: 1472-1480
Journal: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Indexed in: EI (accession no. 20223712715903); Scopus (accession no. 2-s2.0-85137401143); WOS: SCI-EXPANDED (accession no. WOS:000836845200011)
Funding: This work was supported by the National Natural Science Foundation of China (Grant No. 61871039, 62171042, 62102033, 62006020, 61906017), the Beijing Municipal Commission of Education Project (No. KM202111417001, KM201911417001), the Collaborative Innovation Center for Visual Intelligence (Grant No. CYXC2011), the Academic Research Projects of Beijing Union University (No. BPHR2020DZ02, ZB10202003, ZK40202101, ZK120202104), the Beijing Union University Graduate Research and Innovation Funding Project (YZ2020K001), and the Beijing Key Science and Technology Project (No. KZ202211417048, KM202111417001, KM201911417001).
Language: English
Keywords: deep learning; neural network; object detection; feature fusion
Abstract: In the neck of a two-stage object detection network, feature fusion is generally carried out in either a top-down or a bottom-up manner. However, two types of imbalance may exist: feature imbalance in the neck of the model, and gradient imbalance in the region of interest extraction layer caused by changes in object scale. The deeper the network is, the more abstract the learned features are; that is, more semantic information can be extracted, but less image background, spatial location, and other resolution information is retained. In contrast, the shallow layers learn little semantic information but a great deal of spatial location information. We propose the Both Ends to Centre to Multiple Layers (BEtM) feature fusion method to solve the feature imbalance problem in the neck, and a Multi-level Region of Interest Feature Extraction (MRoIE) layer to solve the gradient imbalance problem. In combination with the Region-based Convolutional Neural Network (R-CNN) framework, our Balanced Feature Fusion (BFF) method offers significantly improved network performance compared with the Faster R-CNN architecture. On the MS COCO 2017 dataset, it achieves an average precision (AP) that is 1.9 points and 3.2 points higher than those of the Feature Pyramid Network (FPN) Faster R-CNN framework and the Generic Region of Interest Extractor (GRoIE) framework, respectively.