Nonstationary value-iteration and adaptive control of discounted semi-Markov processes
Authors: Onésimo Hernández-Lerma
Institution: Departamento de Matemáticas, Centro de Investigación del I.P.N., Apartado Postal 14-740, 07000 Mexico, Federal District, Mexico
Abstract: We consider discounted-reward semi-Markov decision processes with a denumerable state space that depend on unknown parameters. Given that the true parameter value is unknown, we address two problems: (I) give an iterative scheme to determine the maximal total discounted reward, and (II) find an asymptotically discount optimal (adaptive) policy. Our solutions are inspired by the nonstationary value iteration (NVI) scheme of Federgruen and Schweitzer (J. Optim. Theory Appl. 34 (1981), 207–241), combined with the ideas of Schäl (Preprint No. 428, Inst. Angew. Math. Univ. Bonn, 1981) concerning the “principle of estimation and control” for the adaptive control of semi-Markov processes.
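As a rough illustration of the idea behind problem (I), the sketch below runs value iteration in which the model used at step n is built from the current parameter estimate rather than from the true parameter. It is not the paper's scheme: the two-state model, the exponential sojourn times (so that the discount factor is rate / (rate + alpha)), the estimate sequence, and all function names are assumptions made only for this example, and the state space is finite rather than denumerable.

```python
def discount_factor(rate, alpha):
    """E[exp(-alpha * tau)] for an exponential sojourn time tau with the given rate."""
    return rate / (rate + alpha)


def bellman_step(v, reward, trans, beta):
    """Apply the discounted-reward operator once for a fixed model.

    reward[x][a] is the reward for action a in state x, trans[x][a][y] the
    transition probability to y, and beta the expected discount per transition.
    Returns the updated values and a greedy policy.
    """
    new_v, policy = [], []
    for x in range(len(v)):
        q_values = [
            reward[x][a] + beta * sum(p * v[y] for y, p in enumerate(trans[x][a]))
            for a in range(len(reward[x]))
        ]
        best = max(range(len(q_values)), key=q_values.__getitem__)
        new_v.append(q_values[best])
        policy.append(best)
    return new_v, policy


def nonstationary_value_iteration(reward, trans, alpha, rate_estimates):
    """Value iteration where step n uses the n-th estimate of the unknown rate."""
    v = [0.0] * len(reward)
    policy = [0] * len(reward)
    for rate_n in rate_estimates:
        beta_n = discount_factor(rate_n, alpha)
        v, policy = bellman_step(v, reward, trans, beta_n)
    return v, policy


# Hypothetical two-state, two-action model (illustrative numbers only).
reward = [[1.0, 0.0], [0.0, 2.0]]
trans = [
    [[0.9, 0.1], [0.1, 0.9]],
    [[0.5, 0.5], [0.2, 0.8]],
]
alpha = 0.1          # continuous-time discount rate
true_rate = 1.0      # the "unknown" parameter
steps = 400

# Assumed estimate sequence converging to the true rate.
estimates = [true_rate + 0.5 ** n for n in range(steps)]

v_nvi, policy_nvi = nonstationary_value_iteration(reward, trans, alpha, estimates)
v_ref, policy_ref = nonstationary_value_iteration(
    reward, trans, alpha, [true_rate] * steps
)
print(v_nvi, policy_nvi)
print(v_ref, policy_ref)
```

With estimates that converge to the true rate, the two runs should end with nearly identical values and the same greedy policy. Acting greedily with respect to the current estimate at each step is one loose reading of "estimation and control"; the paper's precise construction for problem (II) is not reproduced here.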
This article is indexed in ScienceDirect and other databases.