Details
MADRL-based UAV swarm non-cooperative game under incomplete information (indexed in SCI-EXPANDED and EI)
Document type: Journal article
English title: MADRL-based UAV swarm non-cooperative game under incomplete information
Authors: Wang, Ershen[1]; Liu, Fan[1]; Hong, Chen[2]; Guo, Jing[1]; Zhao, Lin[3]; Xue, Jian[3]; He, Ning[4]
First author: Wang, Ershen
Corresponding author: Hong, C[1]
Affiliations: [1] Shenyang Aerosp Univ, Sch Elect & Informat Engn, Shenyang 110136, Peoples R China; [2] Beijing Union Univ, Coll Robot, Beijing 100101, Peoples R China; [3] Univ Chinese Acad Sci, Sch Engn Sci, Beijing 100049, Peoples R China; [4] Beijing Union Univ, Coll Smart City, Beijing 100101, Peoples R China
First affiliation: Shenyang Aerosp Univ, Sch Elect & Informat Engn, Shenyang 110136, Peoples R China
Corresponding affiliation: [1] (corresponding author), Beijing Union Univ, Coll Robot, Beijing 100101, Peoples R China. | [1141739] College of Robotics, Beijing Union University; [11417] Beijing Union University
Year: 2024
Volume: 37
Issue: 6
Pages: 293-306
Journal: CHINESE JOURNAL OF AERONAUTICS
Indexed in: EI (accession no. 20242016099146); Scopus (accession no. 2-s2.0-85192932294); WOS: SCI-EXPANDED (accession no. WOS:001258305200001)
Funding: The work was supported by the National Key R&D Program of China (No. 2018AAA0100804), the National Natural Science Foundation of China (No. 62173237), the Academic Research Projects of Beijing Union University, China (Nos. SK160202103, ZK50201911, ZK30202107, ZK30202108), the SongShan Laboratory Foundation, China (No. YYJC062022017), the Applied Basic Research Programs of Liaoning Province, China (Nos. 2022020502-JH2/1013, 2022JH2/101300150), the Special Funds program of Civil Aircraft, China (No. 01020220627066) and the Special Funds program of Shenyang Science and Technology, China (No. 22-322-3-34).
Language: English
Keywords: UAV swarm; Reinforcement learning; Deep learning; Multi-agent; Non-cooperative game; Nash equilibrium
Abstract: Unmanned Aerial Vehicles (UAVs) play an increasingly important role in the modern battlefield. In this paper, considering the incomplete observation information available to an individual UAV in a complex combat environment, we put forward a UAV swarm non-cooperative game model based on Multi-Agent Deep Reinforcement Learning (MADRL), in which the state space and action space are constructed to reflect the real features of UAV swarm air-to-air combat. The multi-agent particle environment is employed to generate a UAV combat scene with a continuous observation space. Several recently popular MADRL methods are compared extensively in the UAV swarm non-cooperative game model. The results indicate that Multi-Agent Soft Actor-Critic (MASAC) outperforms the other MADRL methods by a large margin. A UAV swarm employing MASAC can learn more effective policies and obtain a much higher hit rate and win rate. Simulations under different swarm sizes and UAV physical parameters are also performed, and they imply that MASAC generalizes well. Furthermore, the practicability and convergence of MASAC are addressed by investigating the loss value of the Q-value networks with respect to individual UAVs. The results demonstrate that MASAC has good practicability and that the Nash equilibrium of the UAV swarm non-cooperative game under incomplete information can be reached. (c) 2024 Chinese Society of Aeronautics and Astronautics. Production and hosting by Elsevier Ltd. All rights reserved. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).