
Details

基于MASAC的无人机集群对抗博弈方法    

MASAC-based confrontation game method of UAV clusters

Document type: Journal article

Chinese title: 基于MASAC的无人机集群对抗博弈方法

English title: MASAC-based confrontation game method of UAV clusters

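Authors: 王尔申 (Wang Ershen) [1]; 刘帆 (Liu Fan) [1]; 宏晨 (Hong Chen) [2]; 郭靖 (Guo Jing) [1]; 何宁 (He Ning) [3]; 赵琳 (Zhao Lin) [4]; 薛健 (Xue Jian) [4]

First author: 王尔申 (Wang Ershen)

Affiliations: [1] School of Electronic and Information Engineering, Shenyang Aerospace University, Shenyang 110136; [2] College of Robotics, Beijing Union University, Beijing 100101; [3] College of Smart City, Beijing Union University, Beijing 100101; [4] School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049

First affiliation: School of Electronic and Information Engineering, Shenyang Aerospace University, Shenyang 110136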

Year: 2022

Volume: 52

Issue: 12

Pages: 2254-2269

Chinese journal name: 中国科学:信息科学

Foreign journal name: Scientia Sinica (Informationis)

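Indexed in: CSTPCD; Scopus; PKU Core Journals (2020 edition); CSCD (2021-2022)

Funding: Supported by the National Key Research and Development Program of China (Grant No. 2018AAA0100804).

Language: Chinese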

Chinese keywords: 深度强化学习; 多智能体; 对抗博弈; MASAC; 无人机集群

English keywords: deep reinforcement learning; multi-agent; confrontation game; MASAC; UAV clusters

Abstract: With the improving intelligence of UAVs and the development of cluster control technology, intelligent decision-making methods for UAV cluster confrontation will become a key technology for future UAV combat. The learning environment for UAV cluster confrontation is complex: it is high-dimensional and nonlinear, has incomplete information, and has a continuous action space. In recent years, artificial intelligence technologies represented by deep learning and reinforcement learning have achieved major breakthroughs, and deep reinforcement learning has shown a strong ability to solve intelligent decision-making problems in complex environments. Inspired by the multi-agent centralized-training, distributed-execution framework and the idea of maximizing policy entropy, this paper proposes a multi-agent soft actor-critic (MASAC) deep reinforcement learning method based on incomplete information. A game model of UAV cluster confrontation based on multi-agent deep reinforcement learning is established, and a continuous-space multi-UAV combat environment is constructed. Simulation experiments are performed on the asymmetric confrontation between red and blue UAV clusters. The experimental results show that MASAC outperforms existing popular multi-agent deep reinforcement learning methods and enables both sides of the game to converge to an equilibrium point with higher returns. Further experiments and analysis of the convergence of MASAC show that it has good convergence and stability, ensuring the practicality of MASAC for intelligent decision-making in UAV cluster confrontation.
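
For reference, the maximum-policy-entropy idea mentioned in the abstract is commonly written, in the single-agent soft actor-critic setting, as the objective sketched below. This is the standard textbook form and not necessarily the exact multi-agent formulation used in the paper. The symbols (policy $\pi$, reward $r$, discount $\gamma$, temperature $\alpha$, entropy $\mathcal{H}$) are assumptions introduced here for illustration and do not appear in this record.

```latex
% Standard maximum-entropy RL objective (single-agent SAC form).
% Assumed notation: s_t state, a_t action, \gamma discount factor,
% \alpha > 0 temperature that weights the entropy bonus.
J(\pi) = \mathbb{E}_{\tau \sim \pi}\left[ \sum_{t=0}^{\infty} \gamma^{t}
  \Big( r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s)\big) = - \mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[ \log \pi(a \mid s) \big]
```

A larger $\alpha$ rewards more exploratory (higher-entropy) policies, while $\alpha \to 0$ recovers the usual expected-return objective.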
