On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
Authors:Emmanuel Fernández-Gaucherand  Aristotle Arapostathis  Steven I. Marcus
Affiliation:(1) Department of Electrical and Computer Engineering, The University of Texas at Austin, 78712-1084 Austin, Texas, USA
Abstract:We consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and a finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with an uncountable state space, namely the space of probability distributions on the original core state space. By developing a suitable theoretical framework, it is shown that some characteristics induced in the original problem by the countability of the spaces involved are reflected onto the equivalent problem. Sufficient conditions are then derived for solutions to the average cost optimality equation to exist. We illustrate these results in the context of machine replacement problems. Structural properties for average cost optimal policies are obtained for a two-state replacement problem; these are similar to results available for discount optimal policies. The set of assumptions used compares favorably to others currently available.
This research was supported in part by the Advanced Technology Program of the State of Texas, in part by the Air Force Office of Scientific Research under Grant AFOSR-86-0029, in part by the National Science Foundation under Grant ECS-8617860, and in part by the Air Force Office of Scientific Research (AFSC) under Contract F49620-89-C-0044.
Keywords:Optimal control  Markov chains  partial observability  average cost  optimality equation  structured optimal policies
This article is indexed in SpringerLink and other databases.