Adaptive control of discounted Markov decision chains
Authors:O. Hernández-Lerma; S. I. Marcus
Institution:(1) Departamento de Matemáticas, Centro de Investigación del IPN, Mexico City, DF, Mexico; (2) Department of Electrical Engineering, University of Texas at Austin, Austin, Texas
Abstract:In this paper, we consider discounted-reward finite-state Markov decision processes which depend on unknown parameters. An adaptive policy inspired by the nonstationary value iteration scheme of Federgruen and Schweitzer (Ref. 1) is proposed. This policy is briefly compared with the principle of estimation and control recently obtained by Schäl (Ref. 4).
Funding:This research was supported in part by the Consejo Nacional de Ciencia y Tecnología under Grant No. PCCBBNA-005008, in part by a grant from the IBM Corporation, in part by the Air Force Office of Scientific Research under Grant No. AFOSR-79-0025, in part by the National Science Foundation under Grant No. ECS-0822033, and in part by the Joint Services Electronics Program under Contract No. F49620-77-C-0101.
Keywords:Discounted Markov decision processes with unknown parameters; nonstationary value iteration; parameter estimation; adaptive control; naïve feedback controller
This article is indexed in SpringerLink and other databases.