Details
Intersection decision model based on state adversarial deep reinforcement learning (indexed in SCI-EXPANDED and EI)
Document type: Journal article
English title: Intersection decision model based on state adversarial deep reinforcement learning
Authors: Jiang, Anni[1]; Du, Yu[2]; Yuan, Ying[2]; Li, Jiahong[2]; Jiang, Beiyan[2]
First author: Jiang, Anni
Corresponding author: Du, Y[1]
Affiliations: [1] Beijing Union Univ, Coll Smart City, Beijing, Peoples R China; [2] Beijing Union Univ, Coll Robot, 97 Beisihuan East Rd, Beijing 100101, Peoples R China
First affiliation: Beijing Union University
Corresponding affiliation: [1] (corresponding author), Beijing Union Univ, Coll Robot, 97 Beisihuan East Rd, Beijing 100101, Peoples R China. | [1141739] College of Robotics, Beijing Union University; [11417] Beijing Union University
Year: 2024
Journal: PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART D-JOURNAL OF AUTOMOBILE ENGINEERING
Indexed in: EI (accession no. 20245217588005); Scopus (accession no. 2-s2.0-85212829210); WOS: SCI-EXPANDED (accession no. WOS:001382351400001)
Funding: The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (No. 52072213) and the R&D Program of Beijing Municipal Education Commission (No. KM202311417006).
Language: English
Keywords: Adversarial state space; deep reinforcement learning; robustness of autonomous driving; intersection scenarios
Abstract: The decision-making model for autonomous vehicles based on deep reinforcement learning faces challenges in robustness in complex and dynamic scenarios, such as crossroads. This study proposes a state adversarial deep reinforcement learning decision model for autonomous driving to address this concern. The model incorporates an adversarial state space standard deviation to accommodate the significant differences in the decision model's state space between the initial and final steps. Adversarial state space standard deviation, coupled with real-time state space, creates an adversarial state space set. The decision model obtains an adversarial strategy under the adversarial state space set and integrates it with the strategy of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. This simultaneous optimization of model parameters is designed to improve the robustness of the decision model for autonomous driving. Experimental validation was performed using the Carla simulation environment, involving the creation of four scenarios with varying traffic flow densities and collision rates. In these scenarios, the enhanced model achieved success rates ranging from 89.9% to 98.6%, with the average vehicle travel time staying below 10 s. By contrast, the TD3 and Soft Actor-Critic (SAC) models exhibited success rates between 78.4% and 93.4%, with average vehicle travel times ranging from 10.03 to 20.01 s. When tested across various scenarios, the improved model demonstrated greater robustness compared to the TD3 and SAC models.
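The abstract does not give the model's equations, so the following is only an illustrative sketch of the general idea it describes: build an adversarial state set from the current state and a per-dimension standard deviation, measure how much the policy's action changes across that set, and fold that measure into a TD3-style actor objective. The uniform sampling scheme, the toy `tanh` policy, the max-deviation penalty, the weight `kappa`, and all function names are assumptions made for illustration, not the authors' published method.

```python
import numpy as np

def adversarial_state_set(state, sigma, num_samples, rng):
    """Sample perturbed copies of `state`, each within +/- sigma per dimension.

    Assumption: uniform perturbations bounded by sigma; the paper may
    construct the set differently.
    """
    noise = rng.uniform(-1.0, 1.0, size=(num_samples, state.shape[0]))
    return state + noise * sigma

def policy(weights, states):
    """Toy deterministic policy (hypothetical): tanh of a linear map."""
    return np.tanh(states @ weights)

def robustness_penalty(weights, state, adv_states):
    """Largest squared action deviation over the adversarial state set."""
    nominal = policy(weights, state[None, :])
    perturbed = policy(weights, adv_states)
    return float(np.max(np.sum((perturbed - nominal) ** 2, axis=1)))

def combined_actor_loss(q_value, penalty, kappa=0.5):
    """TD3-style actor loss (-Q) plus a weighted robustness penalty.

    Assumption: a simple weighted sum; `kappa` is a hypothetical weight.
    """
    return -q_value + kappa * penalty

rng = np.random.default_rng(0)
state = np.array([0.2, -0.1, 0.5])    # hypothetical 3-dimensional observation
sigma = np.array([0.05, 0.05, 0.10])  # hypothetical per-dimension std bound
weights = rng.normal(size=(3, 2))     # toy policy parameters, 2-dim action

adv = adversarial_state_set(state, sigma, num_samples=16, rng=rng)
penalty = robustness_penalty(weights, state, adv)
loss = combined_actor_loss(q_value=1.0, penalty=penalty)
```

In a full implementation the penalty would be differentiated with respect to the actor's parameters and minimized jointly with the TD3 actor loss; here it is computed once with NumPy to keep the sketch self-contained.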