Risk-sensitive reinforcement learning algorithms with generalized average criterion

Cite this article:	Yin Chang-ming, Wang Han-xing, Zhao Fei. Risk-sensitive reinforcement learning algorithms with generalized average criterion[J]. Applied Mathematics and Mechanics (English Edition), 2007, 28(3): 405-405. DOI: 10.1007/s10483-007-0313-x

Authors:	Yin Chang-ming, Wang Han-xing, Zhao Fei

Affiliation:	1. College of Computer and Communicational Engineering, Changsha University of Science and Technology, Changsha 410076, P.R. China; College of Sciences, Shanghai University, Shanghai 200444, P.R. China 2. College of Sciences, Shanghai University, Shanghai 200444, P.R. China

Abstract:	A new algorithm is proposed that potentially sacrifices the optimality of control policies in order to obtain robust solutions. Robustness can become a very important property for a learning system when the theoretical model does not match the practical physical system, when the practical system is not static, or when the availability of a control action changes over time. The main contribution is a set of approximation algorithms together with their convergence results. A generalized average operator, used in place of the usual optimality operator max (or min), is applied to study an important class of learning algorithms, dynamic programming algorithms, and to discuss their convergence from a theoretical point of view. The purpose of this research is to improve the robustness of reinforcement learning algorithms theoretically.
Keywords:	reinforcement learning; risk-sensitive; generalized average; algorithm; convergence
Received:	2006-03-21
Revised:	2006-12-07
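
The abstract's central idea, replacing the max operator in dynamic programming with a generalized average, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the choice of an exponential (log-sum-exp) average as the generalized average, the parameter name `beta`, the function names, and the two-state toy problem. With `beta = 0` the operator is the plain arithmetic mean over actions; as `beta` grows it approaches max, so `beta` trades optimality against how much the value depends on any single action.

```python
import math

def exp_average(values, beta):
    """Exponential (quasi-arithmetic) average, one possible generalized average.

    beta = 0 gives the arithmetic mean; large beta approaches max(values).
    This operator is an illustrative assumption, not the paper's definition.
    """
    if beta < 0:
        raise ValueError("this sketch only handles beta >= 0")
    if beta == 0.0:
        return sum(values) / len(values)
    m = max(values)  # shift by the max for numerical stability
    s = sum(math.exp(beta * (v - m)) for v in values) / len(values)
    return m + math.log(s) / beta

def value_iteration(P, R, gamma, beta, tol=1e-10, max_iter=10_000):
    """Value iteration with exp_average in place of max over actions.

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the immediate reward for action a in state s.
    """
    V = [0.0] * len(P)
    for _ in range(max_iter):
        new_V = []
        for s in range(len(P)):
            # One-step lookahead value of each action in state s.
            q = [R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                 for a in range(len(P[s]))]
            new_V.append(exp_average(q, beta))
        if max(abs(x - y) for x, y in zip(new_V, V)) < tol:
            return new_V
        V = new_V
    return V

# Hypothetical toy problem: action 0 moves to state 0, action 1 to state 1.
P = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 0)], [(1.0, 1)]],
]
R = [[0.0, 1.0], [0.0, 2.0]]

V_mean = value_iteration(P, R, gamma=0.9, beta=0.0)    # average over actions
V_sharp = value_iteration(P, R, gamma=0.9, beta=50.0)  # close to the max operator
```

The exponential average is non-expansive in the sup-norm, so with a discount factor below 1 the iteration above is a contraction and converges. That property is what makes it a convenient stand-in here; the paper's own operator and convergence conditions may differ.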
This article is indexed in CNKI, VIP, Wanfang Data, SpringerLink, and other databases.