A variance minimization problem for a Markov decision process


Abstract:	In the steady state of a discrete-time Markov decision process, we consider the problem of finding an optimal randomized policy that minimizes the variance of the reward per transition among all policies whose mean reward is not less than a specified value. The problem is solved by introducing a parametric Markov decision process with an average-cost criterion. It is shown that there exists an optimal policy that is a mixture of at most two pure policies. As an application, the toymaker's problem is discussed.
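To make the two quantities in the abstract concrete, the following is a minimal sketch of how the steady-state mean and variance of the per-transition reward could be evaluated for a given randomized stationary policy. It does not reproduce the paper's parametric method. The transition probabilities and rewards are the figures commonly quoted for Howard's two-state toymaker example and are an assumption here, as are the names `P`, `R`, and `steady_state_mean_variance`; none of them appear in the abstract.

```python
# Assumed toymaker data (commonly quoted from Howard's example, not from the abstract).
# P[i][a][j]: probability of moving from state i to state j under action a.
# R[i][a][j]: reward earned on that transition.
P = [
    [[0.5, 0.5], [0.8, 0.2]],   # state 0: action 0, action 1
    [[0.4, 0.6], [0.7, 0.3]],   # state 1: action 0, action 1
]
R = [
    [[9.0, 3.0], [4.0, 4.0]],
    [[3.0, -7.0], [1.0, -19.0]],
]


def steady_state_mean_variance(policy):
    """Return (mean, variance) of the reward in one steady-state transition.

    policy[i][a] is the probability of choosing action a in state i.
    Hypothetical helper for illustration; valid for this two-state chain only.
    """
    # Transition matrix of the Markov chain induced by the randomized policy.
    M = [[sum(policy[i][a] * P[i][a][j] for a in range(2)) for j in range(2)]
         for i in range(2)]
    # Closed-form stationary distribution of a two-state chain.
    leave0, leave1 = M[0][1], M[1][0]
    pi = [leave1 / (leave0 + leave1), leave0 / (leave0 + leave1)]
    # First and second moments of the reward over (state, action, next state).
    mean = 0.0
    second_moment = 0.0
    for i in range(2):
        for a in range(2):
            for j in range(2):
                weight = pi[i] * policy[i][a] * P[i][a][j]
                mean += weight * R[i][a][j]
                second_moment += weight * R[i][a][j] ** 2
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    # The four pure policies, written as degenerate randomized policies.
    for a0 in range(2):
        for a1 in range(2):
            pure = [[1.0 - a0, float(a0)], [1.0 - a1, float(a1)]]
            print((a0, a1), steady_state_mean_variance(pure))
    # A policy that randomizes only in state 1, chosen arbitrarily.
    print(steady_state_mean_variance([[0.0, 1.0], [0.5, 0.5]]))
```

Under these assumed figures, comparing the printed pairs shows which policies meet a given mean-reward threshold and what variance each one carries, which is the trade-off the constrained problem formalizes.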

This article is indexed in ScienceDirect and other databases.
