首页 | 本学科首页   官方微博 | 高级检索  
     检索      


The value iteration algorithm is not strongly polynomial for discounted dynamic programming
Authors:Eugene A Feinberg  Jefferson Huang
Institution:Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794-3600, USA
Abstract:This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike policy iterations, the value iteration algorithm is not strongly polynomial for discounted dynamic programming.
Keywords:Markov decision process  Value iteration  Strongly polynomial  Policy  Algorithm
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号