On the Bernoulli two-armed bandit problem

Abstract: The paper is initially concerned with monotonic properties of the posterior success probabilities when the prior success probabilities are distributed according to an arbitrary joint distribution function (Bayesian approach). Next a dynamic programming model is proposed and monotonic properties of the optimal expected cumulative discounted reward are proved. Finally, optimality properties are given for the case when one prior success probability is known.
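The setting described in the abstract can be illustrated with a small sketch. It is not the paper's model: it assumes one arm with a known success probability and an independent Beta(a, b) belief on the other arm (the paper allows an arbitrary joint prior), and it truncates the discounted problem to a finite number of steps. The names `posterior_mean`, `make_value`, `known_p`, `discount` and `steps` are hypothetical and chosen only for this illustration.

```python
from functools import lru_cache


def posterior_mean(a, b):
    """Posterior success probability of the unknown arm under a Beta(a, b) belief.

    Assumption: a Beta prior, so a counts successes and b counts failures
    (each including the prior pseudo-counts).
    """
    return a / (a + b)


def make_value(known_p, discount):
    """Build a finite-horizon value function for the one-known-arm problem.

    known_p  : success probability of the known arm (assumed given)
    discount : discount factor in (0, 1)
    """

    @lru_cache(maxsize=None)
    def value(a, b, steps):
        # Optimal expected discounted reward with `steps` pulls remaining.
        if steps == 0:
            return 0.0
        p = posterior_mean(a, b)
        # Pulling the known arm yields no information, so the belief is unchanged.
        pull_known = known_p + discount * value(a, b, steps - 1)
        # Pulling the unknown arm updates the belief on success or failure.
        pull_unknown = (
            p * (1.0 + discount * value(a + 1, b, steps - 1))
            + (1.0 - p) * discount * value(a, b + 1, steps - 1)
        )
        return max(pull_known, pull_unknown)

    return value


value = make_value(known_p=0.6, discount=0.9)
# A success raises the posterior success probability, a failure lowers it.
print(posterior_mean(2, 1), posterior_mean(1, 1), posterior_mean(1, 2))
# Compare the value before and after observing one extra success.
print(value(1, 1, 10), value(2, 1, 10))
```

Under these assumptions, the two printed lines give a numerical check of the kind of monotonicity the abstract refers to: the posterior success probability and the optimal expected discounted reward should not decrease when an extra success is observed on the unknown arm.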
