
UGV Path Programming Based on the DQN With Noise in the Output Layer (indexed: Peking University Core Journals, CSCD)

Cite this article:	Li Yang, Yan Dongmei, Liu Lei. UGV Path Programming Based on the DQN With Noise in the Output Layer[J]. Applied Mathematics and Mechanics (应用数学和力学), 2023, 44(4): 450-460. DOI: 10.21656/1000-0887.430070

Authors:	Li Yang (李杨), Yan Dongmei (闫冬梅), Liu Lei (刘磊)

Affiliation:	1. College of Science, Hohai University, Nanjing 211100

Funding:	National Natural Science Foundation of China (General Program), Grant No. 61773152

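Abstract:	The path planning problem for unmanned ground vehicles (UGVs) is studied within the framework of the DQN algorithm. To improve exploration efficiency, the DQN algorithm, which normally handles continuous states, is adapted and applied to discrete states. To balance exploration and exploitation, noise is added only to the output layer of the DQN network, and a progressive reward function is designed. Experiments are then carried out in the Gazebo simulation environment. The simulation results show that: (1) the strategy quickly plans a collision-free route from the initial point to the target point, and the proposed algorithm converges faster than the Q-learning, DQN and noisynet_DQN algorithms; (2) the strategy generalizes with respect to the initial point, the target point and the obstacles, which verifies its effectiveness and robustness.
Keywords:	deep reinforcement learning; unmanned ground vehicle; DQN algorithm; Gaussian noise; path planning; Gazebo simulation
Received:	2022-03-07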
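
The abstract mentions a progressive reward function but does not define it. As a minimal sketch only, one plausible reading (an assumption here, not the authors' formula) is a shaped reward that grows with the distance gained toward the target at each step, with terminal bonuses and penalties. The function name `progressive_reward` and all constants below are hypothetical.

```python
import math


def progressive_reward(prev_pos, pos, goal, collided,
                       goal_radius=0.2, goal_bonus=100.0,
                       collision_penalty=-100.0, progress_gain=10.0):
    """Return (reward, done) for one step of a 2-D navigation task.

    All constants are illustrative assumptions, not values from the paper.
    """
    if collided:
        # Hitting an obstacle ends the episode with a large penalty.
        return collision_penalty, True

    dist = math.dist(pos, goal)
    if dist <= goal_radius:
        # Reaching the target ends the episode with a large bonus.
        return goal_bonus, True

    # Otherwise the reward is proportional to the distance gained toward
    # the goal during this step (negative when moving away from it).
    progress = math.dist(prev_pos, goal) - dist
    return progress_gain * progress, False
```

With these assumed constants, a step from (0, 0) to (1, 0) toward a goal at (3, 0) earns a positive reward, and the reverse step earns the same amount with the opposite sign.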
UGV Path Programming Based on the DQN With Noise in the Output Layer

Li Y., Yan D., Liu L. UGV Path Programming Based on the DQN With Noise in the Output Layer[J]. Applied Mathematics and Mechanics, 2023, 44(4): 450-460. DOI: 10.21656/1000-0887.430070

Authors:	Li Y., Yan D., Liu L.

Affiliation:	1. College of Science, Hohai University, Nanjing 211100, P.R. China; 2. School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 211100, P.R. China

Abstract:	The path programming of the unmanned ground vehicle (UGV) was studied under the framework of the deep Q-network (DQN) algorithm. To improve exploration efficiency, the DQN algorithm, which is designed for continuous states, was adapted to discrete states by discretizing the continuous state space. To balance exploration and exploitation, Gaussian noise was added only to the output layer of the network, and a progressive reward function was designed. Finally, experiments were carried out in the Gazebo simulation environment. The simulation results show that: (1) the strategy quickly programs a collision-free route from the initial point to the target point, and it converges significantly faster than the Q-learning algorithm, the DQN algorithm and the noisynet_DQN algorithm; (2) the strategy generalizes with respect to the initial point, the target point and the obstacles, which verifies its effectiveness and robustness.

Keywords:	deep reinforcement learning; DQN algorithm; Gaussian noise; Gazebo simulation; path programming; UGV
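
The idea of adding Gaussian noise only to the output layer can be illustrated with a small sketch. It assumes a two-layer Q-network whose hidden layer is deterministic and whose output weights and biases are perturbed by zero-mean Gaussian noise of fixed scale `sigma` at each forward pass. The abstract does not give the exact noise parameterization (for example, whether the scale is learned as in NoisyNet), so the class `OutputNoiseQNet` and its parameters are hypothetical and training is omitted.

```python
import random


class OutputNoiseQNet:
    """Tiny Q-network; Gaussian noise perturbs only the output layer."""

    def __init__(self, n_state, n_hidden, n_action, sigma=0.1, seed=0):
        self.rng = random.Random(seed)
        self.sigma = sigma  # assumed fixed noise scale (not from the paper)
        self.w1 = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                   for _ in range(n_state)]
        self.w2 = [[self.rng.gauss(0.0, 0.1) for _ in range(n_action)]
                   for _ in range(n_hidden)]
        self.b2 = [0.0] * n_action

    def q_values(self, state, noisy=True):
        # Hidden layer: deterministic ReLU units, no noise here.
        hidden = [max(0.0, sum(s * self.w1[i][j] for i, s in enumerate(state)))
                  for j in range(len(self.w2))]
        q = []
        for a in range(len(self.b2)):
            total = self.b2[a]
            if noisy:
                total += self.rng.gauss(0.0, self.sigma)  # noisy bias
            for j, h in enumerate(hidden):
                w = self.w2[j][a]
                if noisy:
                    w += self.rng.gauss(0.0, self.sigma)  # noisy weight
                total += h * w
            q.append(total)
        return q

    def act(self, state, noisy=True):
        # Greedy action on the (possibly noisy) Q-values; the noise itself
        # drives exploration, so no epsilon-greedy schedule is used.
        q = self.q_values(state, noisy)
        return max(range(len(q)), key=q.__getitem__)
```

During training one would call `act(state)` with `noisy=True` to explore, and `noisy=False` at evaluation time to follow the learned greedy policy.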
This article is indexed in VIP (维普) and other databases.