Successive approximations in partially observable controlled Markov chains with risk-sensitive average criterion
Abstract: Partially observable Markov decision chains with finite state, action, and signal spaces are considered. The performance index is the risk-sensitive average criterion. Under conditions concerning reachability between the unobservable states and observability of the signals, it is shown that the value iteration algorithm can be used to approximate the optimal average cost, to determine a stationary policy whose performance index is arbitrarily close to the optimal one, and to establish the existence of solutions to the optimality equation. The results rely on an appropriate extension of Schweitzer's well-known transformation.
Keywords: Reduction to a completely observable model; Schweitzer's transformation; Equicontinuity of the value iteration functions; Birkhoff's distance; Lipschitz norm
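
As an illustration of the value iteration idea mentioned in the abstract, the following is a minimal sketch for the much simpler completely observable case with a finite state space. It is not the paper's algorithm: it ignores partial observability and does not implement Schweitzer's transformation. The function name `risk_sensitive_value_iteration`, the data layout `P[a][x][y]` and `C[x][a]`, the reference state 0, and the assumption that all transition probabilities are positive (so that the iteration converges) are hypothetical choices made for this example.

```python
import math


def risk_sensitive_value_iteration(P, C, lam, n_iter=1000, tol=1e-10):
    """Relative value iteration for a risk-sensitive average cost (sketch).

    P[a][x][y]: probability of moving from state x to y under action a
                (assumed strictly positive in this sketch).
    C[x][a]:    one-step cost of action a in state x.
    lam:        risk-sensitivity coefficient, assumed > 0.
    Returns (g, h): approximate optimal average cost and relative value function.
    """
    n_states = len(C)
    n_actions = len(C[0])
    h = [0.0] * n_states  # normalized so that h[0] == 0 throughout
    g = 0.0
    for _ in range(n_iter):
        m = max(h)  # shift used to keep the exponentials bounded
        new_h = []
        for x in range(n_states):
            best = math.inf
            for a in range(n_actions):
                # Certainty equivalent of h under exponential utility:
                # (1/lam) * log sum_y P(y | x, a) * exp(lam * h(y))
                s = sum(P[a][x][y] * math.exp(lam * (h[y] - m))
                        for y in range(n_states))
                best = min(best, C[x][a] + m + math.log(s) / lam)
            new_h.append(best)
        g = new_h[0]  # increment at the reference state estimates the average cost
        new_h = [v - g for v in new_h]
        diff = max(abs(u - v) for u, v in zip(new_h, h))
        h = new_h
        if diff < tol:
            break
    return g, h
```

In this simplified setting, a fixed point satisfies g + h(x) = min_a [ C(x, a) + (1/lam) log sum_y P(y | x, a) exp(lam h(y)) ], which is the fully observable counterpart of the optimality equation referred to in the abstract. With a single action the iteration reduces to a power method for the matrix with entries exp(lam C(x)) P(x, y), so g approximates (1/lam) times the logarithm of its spectral radius.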