An envelope theorem and some applications to discounted Markov decision processes

Authors:	Hugo Cruz-Suárez, Raúl Montes-de-Oca

Affiliation:	(1) Benemérita Universidad Autónoma de Puebla, Ave. San Claudio y Río Verde, Col. San Manuel, CU, Puebla, Pue., 72570, Mexico; (2) Departamento de Matemáticas, Universidad Autónoma Metropolitana-Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, Mexico, D.F., 09340, Mexico

Abstract:	In this paper, an Envelope Theorem (ET) is established for optimization problems on Euclidean spaces. In general, Envelope Theorems permit analyzing an optimization problem and obtaining its solution by means of differentiability techniques. The ET is presented in two versions: one uses concavity assumptions, whereas the other does not require them. The ET is then applied to discounted, infinite-horizon Markov Decision Processes (MDPs) on Euclidean spaces. As a first application, several examples (including some economic models) of discounted MDPs are presented for which the ET allows the value iteration functions to be determined; this in turn yields the corresponding optimal value functions and optimal policies. As a second application, it is proved that, under differentiability conditions on the transition law, the reward function, and the noise of the system, the value function and the optimal policy of the problem are differentiable with respect to the state of the system. Various examples illustrating these differentiability conditions are also provided. This work was partially supported by Benemérita Universidad Autónoma de Puebla (BUAP) under grant VIEP-BUAP 38/EXC/06-G, by Consejo Nacional de Ciencia y Tecnología (CONACYT), and by Evaluation-orientation de la COopération Scientifique (ECOS) under grant CONACyT-ECOS M06-M01.

Keywords:	Envelope theorem; Discounted Markov decision process; Differentiability of the optimal value function; Differentiability of the optimal policy; Economic growth model
This article is indexed in SpringerLink and other databases.
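
The differentiability statements in the abstract can be illustrated with the classical envelope identity. The block below is a sketch in textbook form, not the exact theorem proved in the paper. The symbols G, A(x), f, r, F, \xi and the discount factor \alpha are notation assumed here for illustration only. The formulas take the state and action to be scalar, and presuppose an interior maximizer, sufficient smoothness, and that differentiation and expectation may be interchanged.

```latex
% Static problem (assumed notation): value V, objective G, maximizer f.
\begin{align*}
V(x)  &= \sup_{a \in A(x)} G(x,a), \qquad f(x) \in \arg\max_{a \in A(x)} G(x,a), \\
V'(x) &= G_x\bigl(x, f(x)\bigr).
\end{align*}

% Discounted MDP with assumed dynamics x_{t+1} = F(x_t, a_t, \xi_t),
% reward r, discount factor \alpha \in (0,1), value iteration functions v_n.
\begin{align*}
v_{n+1}(x) &= \max_{a \in A(x)} \Bigl\{ r(x,a) + \alpha\, E\bigl[ v_n\bigl(F(x,a,\xi)\bigr) \bigr] \Bigr\}, \\
V(x)  &= r\bigl(x,f(x)\bigr) + \alpha\, E\bigl[ V\bigl(F(x,f(x),\xi)\bigr) \bigr], \\
V'(x) &= r_x\bigl(x,f(x)\bigr) + \alpha\, E\bigl[ V'\bigl(F(x,f(x),\xi)\bigr)\, F_x\bigl(x,f(x),\xi\bigr) \bigr].
\end{align*}
```

In this form the derivative of the maximizer f does not appear in V'(x): at an interior optimum the first-order condition in the action makes that term vanish, which is what lets the value function be differentiated without first differentiating the policy.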