
UGV Path Programming Based on the DQN With Noise in the Output Layer (indexed in the Peking University Core Journals list and CSCD)
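Cite this article: Li Yang (李杨), Yan Dongmei (闫冬梅), Liu Lei (刘磊). UGV Path Programming Based on the DQN With Noise in the Output Layer[J]. Applied Mathematics and Mechanics (应用数学和力学), 2023, 44(4): 450-460. DOI: 10.21656/1000-0887.430070
Authors: Li Yang (李杨)  Yan Dongmei (闫冬梅)  Liu Lei (刘磊)
Affiliation: 1. College of Science, Hohai University, Nanjing 211100
Funding: National Natural Science Foundation of China (General Program) 61773152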
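Abstract: The path programming problem for unmanned ground vehicles (UGVs) is studied within the framework of the DQN algorithm. To improve exploration efficiency, the DQN algorithm, which normally handles continuous states, is adapted and applied to discrete states. To balance exploration and exploitation, noise is added only to the output layer of the DQN network, and a progressive reward function is designed. Experiments are then carried out in the Gazebo simulation environment. The simulation results show that: (1) the strategy quickly plans a collision-free route from the initial point to the target point, and the proposed algorithm converges faster than the Q-learning, DQN, and noisynet_DQN algorithms; (2) the strategy generalizes with respect to the initial point, the target point, and the obstacles, which verifies its effectiveness and robustness.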
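The abstract names a progressive reward function but does not define it. The following is only a minimal sketch of one common way such a reward could be shaped, assuming a 2-D position, a distance-based progress term, and terminal rewards for reaching the goal or colliding. The function name `progressive_reward` and every parameter value (`goal_radius`, `goal_reward`, `collision_penalty`, `progress_scale`) are hypothetical and are not taken from the paper.

```python
import math


def progressive_reward(prev_pos, pos, goal, collided,
                       goal_radius=0.2, goal_reward=100.0,
                       collision_penalty=-100.0, progress_scale=10.0):
    """Hypothetical shaped reward; all constants are illustrative assumptions.

    prev_pos, pos, goal: (x, y) tuples in the same coordinate frame.
    collided: True if the vehicle hit an obstacle on this step.
    """
    if collided:
        # Terminal penalty for hitting an obstacle.
        return collision_penalty
    dist = math.dist(pos, goal)
    if dist <= goal_radius:
        # Terminal reward once the vehicle is close enough to the goal.
        return goal_reward
    # Intermediate steps: reward is proportional to the distance gained
    # toward the goal, so it is negative when the vehicle moves away.
    return progress_scale * (math.dist(prev_pos, goal) - dist)
```

With these assumed constants, a step that closes one unit of distance earns 10.0 and a step that loses one unit costs 10.0, which gives the agent a learning signal long before it first reaches the goal.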

Keywords: deep reinforcement learning  UGV  DQN algorithm  Gaussian noise  path programming  Gazebo simulation
Received: 2022-03-07

UGV Path Programming Based on the DQN With Noise in the Output Layer
Li Y., Yan D., Liu L. UGV Path Programming Based on the DQN With Noise in the Output Layer[J]. Applied Mathematics and Mechanics, 2023, 44(4): 450-460. DOI: 10.21656/1000-0887.430070
Authors: Li Y.  Yan D.  Liu L.
Affiliation: 1. College of Science, Hohai University, Nanjing 211100, P.R. China; 2. School of Modern Posts, Nanjing University of Posts and Telecommunications, Nanjing 211100, P.R. China
Abstract: The path programming of the unmanned ground vehicle (UGV) was studied under the framework of the deep Q-network (DQN) algorithm. To improve exploration efficiency, the DQN algorithm, which normally handles continuous states, was adapted to discrete states by discretizing the continuous state. To balance exploration and exploitation, Gaussian noise was added only to the output layer of the network, and a progressive reward function was designed. Finally, experiments were carried out in the Gazebo simulation environment. The simulation results show that, first, the strategy quickly plans a collision-free route from the initial point to the target point and converges significantly faster than the Q-learning, DQN, and noisynet_DQN algorithms; second, the strategy generalizes with respect to the initial point, the target point, and the obstacles, which verifies its effectiveness and robustness.
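To illustrate how output-layer Gaussian noise can trade off exploration against exploitation, here is a rough sketch in which zero-mean noise is added directly to the Q-values before the greedy action is chosen. This is a simplification made for illustration: the paper may instead perturb the output layer's weights (as NoisyNet-style layers do), and the names `noisy_greedy_action` and `decayed_sigma`, as well as the exponential decay schedule, are assumptions rather than the authors' design.

```python
import random


def noisy_greedy_action(q_values, sigma, rng=random):
    """Return the index of the largest Q-value after adding N(0, sigma^2) noise.

    q_values: list of floats, one per discrete action (the network's output).
    sigma: noise standard deviation; 0.0 reduces to plain greedy selection.
    rng: object with a gauss(mu, sigma) method (assumed; defaults to `random`).
    """
    noisy = [q + rng.gauss(0.0, sigma) for q in q_values]
    return max(range(len(noisy)), key=noisy.__getitem__)


def decayed_sigma(sigma0, decay, step, sigma_min=0.0):
    """Hypothetical schedule: shrink the noise scale geometrically per step."""
    return max(sigma_min, sigma0 * decay ** step)
```

A large `sigma` makes the choice close to random (exploration), while a `sigma` near zero recovers the greedy policy (exploitation), so no separate epsilon-greedy rule is needed in this sketch.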
Keywords:deep reinforcement learning  DQN algorithm  Gaussian noise  Gazebo simulation  path programming  UGV
This article is indexed in VIP (维普) and other databases.