Single Sample Path-Based Optimization of Markov Chains
Authors: Cao, X. R.
Affiliation: (1) Hong Kong University Grant Council under Grant HKUST 690/95E; (2) Professor, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
Abstract: Motivated by the needs of on-line optimization of real-world engineering systems, we study single sample path-based algorithms for Markov decision problems (MDPs). The sample path used in the algorithms can be obtained by observing the operation of a real system. We give a simple example to explain the advantages of the sample path-based approach over the traditional computation-based approach: matrix inversion is not required; some transition probabilities do not have to be known; storage space may be saved; and the actions can be updated for only a subset of the state space in each iteration. The effect of estimation errors and the convergence property of the sample path-based approach are studied. Finally, we propose a fast algorithm, which updates the policy whenever the system reaches a particular set of states, and prove that the algorithm converges to the true optimal policy with probability one under some conditions. The sample path-based approach may have important applications to the design and management of engineering systems, such as high-speed communication networks. This work was supported in part by
Keywords: Perturbation analysis; on-line optimization; Markov decision processes; performance potentials
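
The claim in the abstract that matrix inversion is not required can be illustrated with a minimal sketch, under stated assumptions: a finite ergodic chain, a reward function f on states, and a simple windowed estimator that averages the centered reward over a fixed horizon after each visit to a state. The function names (simulate_path, estimate_potentials), the window length, and the two-state chain are hypothetical choices made for illustration, and this estimator is not necessarily the one used in the paper.

```python
import random


def simulate_path(P, start, steps, rng):
    """Simulate `steps` transitions of a Markov chain with transition matrix P.

    In an on-line setting this path would be observed from the real
    system; here it is simulated only to make the sketch runnable.
    """
    path = [start]
    state = start
    for _ in range(steps):
        u = rng.random()
        cum = 0.0
        nxt = len(P[state]) - 1  # fallback guards against rounding error
        for j, p in enumerate(P[state]):
            cum += p
            if u < cum:
                nxt = j
                break
        state = nxt
        path.append(state)
    return path


def estimate_potentials(path, f, window):
    """Estimate the average reward and the potentials from one sample path.

    The potential of state i is approximated by the average, over all
    visits to i, of sum_{l=0}^{window-1} (f(X_{n+l}) - eta), where eta is
    the sample-average reward. No matrix is inverted and the transition
    probabilities are never used.
    """
    n_states = len(f)
    eta = sum(f[x] for x in path) / len(path)

    # Prefix sums of the centered reward make each window sum O(1).
    prefix = [0.0]
    for x in path:
        prefix.append(prefix[-1] + f[x] - eta)

    totals = [0.0] * n_states
    visits = [0] * n_states
    for n in range(len(path) - window):
        i = path[n]
        totals[i] += prefix[n + window] - prefix[n]
        visits[i] += 1

    g = [totals[i] / visits[i] if visits[i] else 0.0 for i in range(n_states)]
    return eta, g


if __name__ == "__main__":
    # Hypothetical two-state chain and reward function.
    P = [[0.7, 0.3], [0.5, 0.5]]
    f = [1.0, 3.0]
    path = simulate_path(P, 0, 200_000, random.Random(0))
    eta, g = estimate_potentials(path, f, window=20)
    print("estimated average reward:", eta)
    print("estimated potential difference g[0] - g[1]:", g[0] - g[1])
```

For this two-state chain the Poisson equation gives the exact values eta = 1.75 and g[0] - g[1] = (f[0] - f[1]) / (0.3 + 0.5) = -2.5, so the estimates can be checked against them. Only differences of potentials matter for comparing actions, which is why the sketch does not normalize g.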
This article is indexed in SpringerLink and other databases.