On maximizing the average time at a goal |
| |
Authors: | S. Demko, T.P. Hill |
| |
Affiliation: | School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332, USA |
| |
Abstract: | In a decision process (gambling or dynamic programming problem) with finite state space and arbitrary decision sets (gambles or actions), there is always available a Markov strategy which uniformly (nearly) maximizes the average time spent at a goal. If the decision sets are closed, there is even a stationary strategy with the same property. Examples are given to show that approximations by discounted or finite-horizon payoffs are not useful for the general average reward problem. |
| |
Keywords: | gambling theory; goal problems; dynamic programming; stationary strategy; Markov strategy; average reward criterion |
This article is indexed in ScienceDirect and other databases. |