On the evaluation of strategies for branching bandit processes |
| |
Authors: | K. D. Glazebrook, R. J. Boys, N. A. Fay |
| |
Affiliation: | (1) Department of Mathematics and Statistics, University of Newcastle upon Tyne, Newcastle upon Tyne, England; (2) Department of Mathematical Sciences, University of Durham, Durham, England |
| |
Abstract: | Glazebrook [1] has given an account of improved procedures for strategy evaluation for resource allocation in a stochastic environment. These methods are extended in the paper in such a way that they can be applied to problems which, for example, have precedence constraints and/or an arrivals process of new jobs. Theoretical results, backed up by numerical studies, show that quasi-myopic heuristics often perform well. |
| |
Keywords: | Bandit process; Gittins' index; Markov decision process; stopping time; strategy evaluation |
This article is indexed in SpringerLink and other databases. |
|