The variational calculus and approximation in policy space for Markovian decision processes
Authors: Paul J Schweitzer
Institution: The Graduate School of Management, The University of Rochester, Rochester, New York 14627, U.S.A.
Abstract: The functional equations of Markovian decision processes determine the state values (and, in the undiscounted case, the gain rate). Variational expressions are exhibited here for these state values (and gain rate); the expressions are stationary when evaluated at the correct values. When guesses for the values (and gain rate) are inserted into the variational expressions, a superior guess is usually obtained. Repetition of this procedure is shown to be equivalent to the method of successive approximations in policy space. The procedure has two other unusual features. First, when the linear equations determining the Lagrange multipliers are non-singular, the variational expressions for the state variables are precisely one Newton-Raphson iteration. Second, when applied to a linear objective function with piecewise-linear constraints, as arise in the functional equations of Markovian decision processes, the variational test quantity is piecewise constant, i.e., its first and all higher variations vanish. The latter feature explains the procedure's good performance (one-step convergence) when good estimates are available.
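The equivalence described in the abstract can be illustrated with a small numerical sketch. The code below is not taken from the paper: it assumes a discounted process only (no gain rate), and the two-state, two-action transition matrices, rewards, and discount factor are invented for illustration, as are all function names. It reads the improved guess as "choose the policy that is greedy for the current guess, then solve that policy's linear equations", which is one step of policy iteration and also one Newton-Raphson step on the residual of the functional equations. Since the result depends on the guess only through the greedy policy, it is piecewise constant in the guess.

```python
# Hypothetical two-state, two-action discounted MDP (all numbers are made up).
BETA = 0.9                                  # discount factor (assumption)
P = [[[0.9, 0.1], [0.4, 0.6]],              # P[a][s][t]: transition probabilities
     [[0.2, 0.8], [0.7, 0.3]]]
R = [[1.0, 0.0],                            # R[a][s]: one-step rewards
     [0.0, 2.0]]
N_STATES = 2
N_ACTIONS = 2


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(n):
            if i != col:
                f = M[i][col] / M[col][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def q_value(v, s, a):
    """Test quantity for action a in state s, given value guess v."""
    return R[a][s] + BETA * sum(p * x for p, x in zip(P[a][s], v))


def greedy(v):
    """Policy that maximizes the test quantity in every state."""
    return [max(range(N_ACTIONS), key=lambda a: q_value(v, s, a))
            for s in range(N_STATES)]


def evaluate(policy):
    """Solve (I - BETA * P_policy) v = R_policy for the policy's state values."""
    A = [[(1.0 if s == t else 0.0) - BETA * P[policy[s]][s][t]
          for t in range(N_STATES)] for s in range(N_STATES)]
    b = [R[policy[s]][s] for s in range(N_STATES)]
    return solve(A, b)


def improved_guess(v):
    """One policy-iteration step, equivalently one Newton-Raphson step."""
    policy = greedy(v)
    return evaluate(policy), policy


def bellman_residual(v):
    """Largest violation of v(s) = max_a q_value(v, s, a)."""
    return max(abs(max(q_value(v, s, a) for a in range(N_ACTIONS)) - v[s])
               for s in range(N_STATES))


# Repeat the step until the greedy policy stops changing.
v, policy = improved_guess([0.0, 0.0])
while greedy(v) != policy:
    v, policy = improved_guess(v)
print(policy, v, bellman_residual(v))
```

In this sketch, any two guesses that induce the same greedy policy produce the same improved guess, and a guess whose greedy policy is already optimal reaches the solution in a single step. Both observations mirror the piecewise-constant behavior and one-step convergence mentioned in the abstract, though only for this toy discounted example.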
This article is indexed in ScienceDirect and other databases.