One-armed bandit process with a covariate |
| |
Authors: | You Liang, Xikui Wang, Yanqing Yi |
| |
Affiliation: | 1. Department of Statistics, University of Manitoba, Winnipeg, MB, R3T 2N2, Canada; 2. Division of Community Health and Humanities, Faculty of Medicine, Memorial University of Newfoundland, St. John’s, NF, 1B 3V6, Canada
|
| |
Abstract: | We generalize the bandit process with a covariate introduced by Woodroofe in several significant directions: a linear regression model characterizing the unknown arm, an unknown variance for the regression residuals, and a general discounting sequence for a non-stationary model. Using a Bayesian regression approach, we assume a normal-gamma conjugate prior distribution for the unknown parameters. We show that the optimal strategy is determined by a sequence of index values that are monotonic and that depend on the observed value of the covariate and the updated posterior distributions. We further show that the myopic strategy is not optimal in general. These structural properties help in understanding the tradeoff between information gathering and immediate expected payoff, and may provide some insight into covariate-adjusted response-adaptive designs of clinical trials. |
| |
This article is indexed in SpringerLink and other databases. |
|
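The abstract's normal-gamma conjugate prior can be illustrated with a minimal sketch of a single posterior update. This is the generic textbook update and not the authors' notation or algorithm. It assumes, for simplicity only, a regression through the origin with one coefficient, y = beta * x + noise, where beta given the noise precision tau is normal with mean `mean` and precision `precision_scale * tau`, and tau is gamma distributed with `shape` and `rate`. The names `NormalGamma` and `update` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalGamma:
    """Normal-gamma belief about (beta, tau); all field names are illustrative."""
    mean: float             # prior mean of the regression coefficient beta
    precision_scale: float  # beta | tau has precision precision_scale * tau
    shape: float            # gamma shape parameter for the noise precision tau
    rate: float             # gamma rate parameter for the noise precision tau


def update(prior: NormalGamma, x: float, y: float) -> NormalGamma:
    """Return the posterior after observing response y at covariate value x.

    Assumed model (a simplification, not taken from the paper):
    y = beta * x + noise, with noise ~ Normal(0, 1 / tau).
    """
    precision_scale = prior.precision_scale + x * x
    mean = (prior.precision_scale * prior.mean + x * y) / precision_scale
    shape = prior.shape + 0.5
    rate = prior.rate + 0.5 * (
        y * y
        + prior.precision_scale * prior.mean ** 2
        - precision_scale * mean ** 2
    )
    return NormalGamma(mean, precision_scale, shape, rate)


# One observation: covariate x = 1.0, response y = 2.0.
posterior = update(NormalGamma(0.0, 1.0, 1.0, 1.0), x=1.0, y=2.0)
```

Because the prior is conjugate, the posterior stays in the normal-gamma family, so the same function can be applied again after each new pull of the unknown arm. An index that depends on the covariate and the current posterior, as the abstract describes, would take this updated state as its input.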