Convergence of Markov decision processes with constraints and state-action dependent discount factors
Authors: Wu Xiao, Guo Xianping
Institutions: 1. School of Mathematics and Statistics, Zhaoqing University, Zhaoqing 526061, China; 2. School of Mathematics, Sun Yat-sen University, Guangzhou 510275, China
Abstract: This paper studies the convergence of a sequence of discrete-time Markov decision processes (DTMDPs) with constraints, state-action dependent discount factors, and possibly unbounded costs. Using the convex analytic approach under mild conditions, we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the "limit" one. Furthermore, we show that any countable-state DTMDP can be approximated by a sequence of finite-state DTMDPs, which are constructed using the truncation technique. Finally, we illustrate the approximation by solving a controlled queueing system numerically, and give the corresponding error bound of the approximation.
Keywords: discrete-time Markov decision processes; state-action dependent discount factors; unbounded costs; convergence
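
The truncation idea named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the single-server queue, the arrival probability `lam`, the service rates `mus`, the costs, the form of the discount factor `alpha[x][a]`, and the function names. The sketch also drops the constraints and uses plain value iteration rather than the paper's convex analytic approach. It only shows how a countable-state model on {0, 1, 2, ...} might be cut down to {0, ..., n_max} and how the optimal value at the empty state settles as n_max grows.

```python
def build_truncated_queue(n_max, lam=0.2, mus=(0.3, 0.6), serve_cost=(0.0, 1.0)):
    """Finite-state model on {0, ..., n_max} obtained by truncating a queue.

    All parameter values are illustrative assumptions, not from the paper.
    Returns (P, c, alpha) indexed as P[x][a][y], c[x][a], alpha[x][a].
    """
    n_states, n_actions = n_max + 1, len(mus)
    P = [[[0.0] * n_states for _ in range(n_actions)] for _ in range(n_states)]
    c = [[0.0] * n_actions for _ in range(n_states)]
    alpha = [[0.0] * n_actions for _ in range(n_states)]
    for x in range(n_states):
        for a in range(n_actions):
            dep = mus[a] if x > 0 else 0.0   # no departure from an empty queue
            up = min(x + 1, n_max)           # truncation: arrivals at n_max are blocked
            down = max(x - 1, 0)
            P[x][a][up] += lam
            P[x][a][down] += dep
            P[x][a][x] += 1.0 - lam - dep
            c[x][a] = x + serve_cost[a]      # holding cost grows without bound in x
            # Hypothetical state-action dependent discount factor in [0.75, 0.9].
            alpha[x][a] = 0.9 - 0.05 * a - 0.1 * x / (x + 1.0)
    return P, c, alpha


def value_iteration(P, c, alpha, tol=1e-9, max_iter=10_000):
    """Iterate V(x) = min_a [c(x,a) + alpha(x,a) * sum_y P(y|x,a) V(y)]."""
    n_states, n_actions = len(c), len(c[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        V_new = [
            min(
                c[x][a] + alpha[x][a] * sum(p * v for p, v in zip(P[x][a], V))
                for a in range(n_actions)
            )
            for x in range(n_states)
        ]
        if max(abs(u - v) for u, v in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V


if __name__ == "__main__":
    # Optimal cost from the empty queue for increasingly large truncation levels.
    for n_max in (10, 20, 40):
        V = value_iteration(*build_truncated_queue(n_max))
        print(n_max, V[0])
```

Blocking arrivals at the boundary state is only one possible way to redirect the probability mass that leaves the truncated state space; the paper's own construction and error bound may differ.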
This article is indexed in CNKI, VIP, SpringerLink, and other databases.