Adaptive control of discounted Markov decision chains

Authors:	O. Hernández-Lerma; S. I. Marcus

Institution:	(1) Departamento de Matemáticas, Centro de Investigación del IPN, Mexico City, DF, Mexico; (2) Department of Electrical Engineering, University of Texas at Austin, Austin, Texas

Abstract:	In this paper, we consider discounted-reward, finite-state Markov decision processes that depend on unknown parameters. An adaptive policy inspired by the nonstationary value iteration scheme of Federgruen and Schweitzer (Ref. 1) is proposed. This policy is briefly compared with the principle of estimation and control recently obtained by Schäl (Ref. 4).

This research was supported in part by the Consejo Nacional de Ciencia y Tecnología under Grant No. PCCBBNA-005008, in part by a grant from the IBM Corporation, in part by the Air Force Office of Scientific Research under Grant No. AFOSR-79-0025, in part by the National Science Foundation under Grant No. ECS-0822033, and in part by the Joint Services Electronics Program under Contract No. F49620-77-C-0101.

Keywords:	Discounted Markov decision processes with unknown parameters; nonstationary value iteration; parameter estimation; adaptive control; naïve feedback controller
This article is indexed in SpringerLink and other databases.
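
The abstract names the ingredients of the adaptive policy (a parameter estimate, a nonstationary value iteration, and a discounted finite-state model) but not its details. The following is a minimal sketch of one common reading of that idea, not the authors' construction: at each stage the controller re-estimates the parameter, performs a single value-iteration backup under the current estimate, and applies the greedy action at the current state. The two-state model family `model`, the frequency estimator, and all function names are hypothetical choices made only for illustration.

```python
import numpy as np

def nvi_step(v_prev, P_hat, r_hat, beta):
    """One value-iteration backup under the current model estimate.

    v_prev : (S,) values from the previous stage
    P_hat  : (A, S, S) estimated transitions, P_hat[a, x, y] = p(y | x, a)
    r_hat  : (S, A) estimated one-step rewards
    beta   : discount factor in [0, 1)
    Returns the updated values and a greedy action for every state.
    """
    # q[x, a] = r(x, a) + beta * sum_y p(y | x, a) * v_prev[y]
    q = r_hat + beta * np.einsum("axy,y->xa", P_hat, v_prev)
    return q.max(axis=1), q.argmax(axis=1)

def model(theta):
    """Hypothetical family: action a leads to state a with probability theta."""
    P = np.empty((2, 2, 2))
    P[1, :, 1] = theta
    P[1, :, 0] = 1 - theta
    P[0, :, 0] = theta
    P[0, :, 1] = 1 - theta
    return P

def adaptive_run(theta_true, beta=0.9, steps=500, seed=0):
    """Simulate the adaptive controller against the true (unknown) parameter."""
    rng = np.random.default_rng(seed)
    r = np.array([[0.0, 0.0], [1.0, 1.0]])  # reward 1 while in state 1
    P_true = model(theta_true)
    v = np.zeros(2)
    x, hits, theta_hat = 0, 0, 0.5          # assumed initial guess for theta
    for t in range(1, steps + 1):
        # Backup with the current estimate; the model changes from stage to stage.
        v, greedy = nvi_step(v, model(theta_hat), r, beta)
        a = greedy[x]
        y = rng.choice(2, p=P_true[a, x])
        hits += int(y == a)                 # the event {y == a} has probability theta
        theta_hat = hits / t                # relative-frequency estimate
        x = y
    return theta_hat, v, greedy

if __name__ == "__main__":
    theta_hat, v, greedy = adaptive_run(theta_true=0.8)
    print("estimate:", theta_hat, "values:", v, "greedy actions:", greedy)
```

In this toy family every transition is informative about the parameter whichever action is taken, so the estimate is consistent without any forced exploration. That property is an assumption of the sketch, not a claim about the setting treated in the paper.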