The value iteration algorithm is not strongly polynomial for discounted dynamic programming期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

按检索

The value iteration algorithm is not strongly polynomial for discounted dynamic programming

Authors:	Eugene A Feinberg Jefferson Huang

Institution:	Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794-3600, USA

Abstract:	This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike policy iterations, the value iteration algorithm is not strongly polynomial for discounted dynamic programming.

Keywords:	Markov decision process Value iteration Strongly polynomial Policy Algorithm
本文献已被 ScienceDirect 等数据库收录！