Details
改进深度Q网络的无人车换道决策算法研究
Research on Autonomous Vehicle Lane Change Strategy Algorithm Based on Improved Deep Q Network
Document type: Journal article
Chinese title: 改进深度Q网络的无人车换道决策算法研究
English title: Research on Autonomous Vehicle Lane Change Strategy Algorithm Based on Improved Deep Q Network
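Authors: Zhang Xinchen (张鑫辰)[1]; Zhang Jun (张军)[2]; Liu Yuansheng (刘元盛)[2]; Lu Ming (路铭)[3]; Xie Longyang (谢龙洋)[1]
First author: Zhang Xinchen (张鑫辰)
Affiliations: [1] Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101; [2] College of Robotics, Beijing Union University, Beijing 100101; [3] College of Applied Science and Technology, Beijing Union University, Beijing 100101
First affiliation: Beijing Key Laboratory of Information Service Engineering, Beijing Union University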
Year: 2022
Volume: 58
Issue: 7
Pages: 266-275
Chinese journal name: 计算机工程与应用
Foreign-language journal name: Computer Engineering and Applications
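Indexed in: CSTPCD; PKU Core Journals (北大核心): [PKU Core 2020]; CSCD: [CSCD_E2021_2022]
Funding: National Natural Science Foundation of China (61871038, 61871039); Beijing Union University Research Project (ZK130202002); Beijing Union University Premium Funding Project for Academic Human Resources Development, Top-Notch Plan (BPHR2020BZ01); Beijing Union University Graduate Research Innovation Project (YZ2020K001); Low-Speed Autonomous Driving Key Technology Research and Achievement Transformation Project (XP202017).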
Language: Chinese
Chinese keywords: 无人车; 换道决策; 状态价值函数; 动作优势函数; 优先级经验回放
Foreign-language keywords: autonomous vehicle; lane change strategy; state value function; action advantage function; prioritized experience replay
Abstract: The deep Q network (DQN) model has been widely used for autonomous vehicle lane change decisions in highway scenarios, but the traditional DQN suffers from overestimation and slow convergence. To address these problems, an autonomous vehicle lane change decision model based on an improved deep Q network is proposed. First, the obtained state values are input into two neural networks with the same structure but different parameter update frequencies to reduce the correlation between experience samples. Then, the vehicle state information output by the hidden layer is input simultaneously into a state value function stream and an action advantage function stream, so that the Q value of each action in the model can be obtained more accurately. Furthermore, prioritized experience replay (PER) is used to draw experience samples from the experience replay unit, increasing the utilization of important samples. Finally, the proposed model is trained and tested in an experimental scenario built from the NGSIM dataset. The experimental results show that the improved deep Q network model enables autonomous vehicles to understand state changes in the environment better than other DQN models, and improves both the success rate of lane change decisions and the convergence speed of the network.
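Note: The abstract names two mechanisms that a short sketch can make concrete: combining a state value stream with an action advantage stream to obtain Q values, and prioritized experience replay. The Python below is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the commonly used mean-subtracted dueling aggregation Q(s, a) = V(s) + A(s, a) - mean(A) and proportional prioritization with importance-sampling weights; the record does not state which variants the paper uses. The names `dueling_q_values` and `PrioritizedReplay`, the three-action example, and the values of `alpha`, `beta`, and `eps` are hypothetical. The two networks with different update frequencies are not shown.

```python
import random


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Assumption: mean-subtracted aggregation (the usual dueling form).
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


class PrioritizedReplay:
    """Proportional prioritized replay: P(i) = p_i^alpha / sum_k p_k^alpha."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha      # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps          # keeps zero-error samples drawable
        self.samples = []
        self.priorities = []
        self.next_slot = 0      # ring-buffer write position

    def add(self, transition):
        # New samples receive the current maximum priority.
        priority = max(self.priorities, default=1.0)
        if len(self.samples) < self.capacity:
            self.samples.append(transition)
            self.priorities.append(priority)
        else:
            self.samples[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def probabilities(self):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, beta=0.4, rng=random):
        probs = self.probabilities()
        n = len(self.samples)
        indices = rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the non-uniform sampling bias;
        # they are normalized by the largest weight in the batch.
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return indices, [self.samples[i] for i in indices], weights

    def update(self, indices, td_errors):
        # Priority follows the magnitude of the temporal-difference error.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps


if __name__ == "__main__":
    # Hypothetical action set: keep lane, change left, change right.
    print(dueling_q_values(1.0, [0.5, -0.5, 0.0]))
```

In a training loop, `update` would be called with the temporal-difference errors of the sampled batch after each learning step, so transitions with large errors are drawn more often on later steps.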