Time aggregated Markov decision processes via standard dynamic programming
Authors: Edilson F Arruda
Institution:
  • a School of Engineering, Pontifical Catholic University of Rio Grande do Sul, Brazil
  • b Center for Systems and Control-CSC, National Laboratory for Scientific Computation-LNCC. Av. Getúlio Vargas, 333. Petrópolis, RJ 25651-075, Brazil
Abstract: This note addresses the time aggregation approach to ergodic finite-state Markov decision processes (MDPs) with uncontrollable states. We propose using time aggregation as an intermediate step toward constructing a transformed MDP whose state space consists solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and it yields a problem that can be solved by simpler, standard MDP algorithms.
Keywords: Markov decision processes; Time aggregation; Dynamic programming
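As an illustration of what restricting the state space to the controllable states involves, the following is a minimal sketch of the textbook embedded-chain (censoring) computation for a fixed policy: P_CC + P_CU (I - P_UU)^{-1} P_UC. It is not the paper's construction, which this record does not describe, and it covers only transition probabilities, not costs or sojourn times. The function names `solve` and `embedded_chain` and the 3-state example matrix are assumptions made for illustration.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | B], copied so the inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def embedded_chain(P, controllable):
    """Transition matrix of the chain observed only at controllable states.

    P is the full transition matrix under a fixed policy (hypothetical input);
    `controllable` lists the indices of the controllable states.
    """
    C = list(controllable)
    U = [s for s in range(len(P)) if s not in C]
    P_CC = [[P[i][j] for j in C] for i in C]
    if not U:
        return P_CC
    P_CU = [[P[i][j] for j in U] for i in C]
    P_UC = [[P[i][j] for j in C] for i in U]
    I_minus_P_UU = [[(1.0 if i == j else 0.0) - P[i][j] for j in U] for i in U]
    # H[u][c]: probability that, starting from uncontrollable state u,
    # the first controllable state reached is c.
    H = solve(I_minus_P_UU, P_UC)
    return [
        [P_CC[a][b] + sum(P_CU[a][k] * H[k][b] for k in range(len(U)))
         for b in range(len(C))]
        for a in range(len(C))
    ]


# Illustrative 3-state chain: states 0 and 1 controllable, state 2 uncontrollable.
P_example = [
    [0.5, 0.2, 0.3],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
]
P_embedded = embedded_chain(P_example, [0, 1])
```

Each row of `P_embedded` sums to one, because an ergodic chain leaves the uncontrollable set with probability one.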
This article is indexed in ScienceDirect and other databases.