ON THE PROPERTIES OF ε(≥0) OPTIMAL POLICIES IN DISCOUNTED UNBOUNDED RETURN MODEL |
| |
Cite this article: | 董泽清, 张昇. ON THE PROPERTIES OF ε(≥0) OPTIMAL POLICIES IN DISCOUNTED UNBOUNDED RETURN MODEL[J]. 应用数学学报(英文版), 1987(1). |
| |
Authors: | 董泽清 张昇 |
| |
Affiliations: | Institute of Applied Mathematics, Academia Sinica, Yunnan University |
| |
Funding: | Project supported by the Science Fund of the Chinese Academy of Sciences |
| |
Abstract: | This paper investigates the properties of $\varepsilon(\geq 0)$-optimal policies in the model of [2]. It is shown that if $\pi^*=(\pi_0^*,\pi_1^*,\ldots,\pi_n^*,\pi_{n+1}^*,\ldots)$ is a $\beta$-discounted optimal policy, then $(\pi_0^*,\pi_1^*,\ldots,\pi_n^*)^\infty$ is also a $\beta$-discounted optimal policy for all $n\geq 0$. Under some condition we prove that the stochastic stationary policy $\pi_n^{*\infty}$ corresponding to the decision rule $\pi_n^*$ is also optimal for the same discount factor $\beta$. We also show that each $\beta$-optimal stochastic stationary policy $\pi_0^{*\infty}$ can be decomposed into several decision rules whose corresponding stationary policies are each $\beta$-optimal; conversely, a proper convex combination of these decision rules is identified with the former $\pi_0^*$. We further prove that for any $(\varepsilon,\beta)$-optimal policy, say $\pi^*=(\pi_0^*,\pi_1^*,\ldots,\pi_n^*,\pi_{n+1}^*,\ldots)$, the policy $(\pi_0^*,\pi_1^*,\ldots,\pi_{n-1}^*)^\infty$ is $((1-\beta^n)^{-1}\varepsilon,\beta)$-optimal for $n>0$. At the end of this paper we mention that the results about convex combinations and de
|
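The factor $(1-\beta^n)^{-1}$ in the abstract's last stated result can be motivated by a short geometric-series argument. The following is only a sketch under simplifying assumptions that are not taken from the paper: returns are bounded, $\pi^*$ is a Markov policy, and the symbols $V_\beta$, $V_\beta^*$, $r_n$, $P_n$, $T$ and $\mathbf{1}$ are notation introduced here for illustration. The paper's unbounded-return model would need additional conditions to justify the limit, and this is not claimed to be the authors' proof.

```latex
% Assumed notation (hypothetical, not from the source):
%   V_\beta(\pi)  : expected \beta-discounted total return of policy \pi
%   V_\beta^*     : optimal value, so V_\beta(\pi) \le V_\beta^* for all \pi
%   r_n, P_n      : expected discounted return and transition operator of
%                   the first n stages under (\pi_0^*, \ldots, \pi_{n-1}^*)
%   T v = r_n + \beta^n P_n v,  \mathbf{1} = constant function equal to 1
\begin{align*}
T V_\beta^*
  &\ge r_n + \beta^n P_n V_\beta\bigl((\pi_n^*, \pi_{n+1}^*, \ldots)\bigr)
   = V_\beta(\pi^*)
   \ge V_\beta^* - \varepsilon \mathbf{1}, \\
T\bigl(v - c\,\mathbf{1}\bigr)
  &= T v - \beta^n c\,\mathbf{1}
  \quad \text{and } T \text{ is monotone, so by induction on } k: \\
T^k V_\beta^*
  &\ge V_\beta^* - \varepsilon \bigl(1 + \beta^n + \cdots + \beta^{(k-1)n}\bigr)\mathbf{1}, \\
V_\beta\bigl((\pi_0^*, \ldots, \pi_{n-1}^*)^\infty\bigr)
  &= \lim_{k \to \infty} T^k V_\beta^*
   \ge V_\beta^* - \frac{\varepsilon}{1 - \beta^n}\,\mathbf{1}.
\end{align*}
```

Setting $\varepsilon = 0$ in this sketch gives exact optimality of the periodic policy, which is consistent with the first result stated in the abstract.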
This article is indexed in CNKI and other databases. |
|