Similar documents
 Found 20 similar documents (search time: 62 ms)
1.
This paper is concerned with the adaptive control problem, over the infinite horizon, for partially observable Markov decision processes whose transition functions are parameterized by an unknown vector. We treat finite models and impose relatively mild assumptions on the transition function. Provided that a sequence of parameter estimates converging in probability to the true parameter value is available, we show that the certainty equivalence adaptive policy is optimal in the long-run average sense.

2.
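This paper discusses a class of time-inhomogeneous partially observable Markov decision models. Without changing the countability of the state space, the model is transformed into the generalized discounted model of [5], which solves its optimal policy problem. A finite-stage approximation algorithm for the model is also obtained, and the states involved in this algorithm are countable.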

3.
This paper addresses the problem of durable goods manufacturers in an oligopoly seeking optimal values for three decision variables: product warranty, reliability and price. Each firm seeks a warranty-reliability-price combination that maximizes expected profit subject to quite general constraints on the firm's decision variables. Warranty serves as a signal of product reliability, which is not observable by consumers. We present a game-theoretic model of warranty-reliability-price competition in such a market and examine Nash equilibria for this game. We show that under fairly general assumptions each firm can optimally set its warranty and reliability independently of price and competitors' actions. In addition, we show that optimal warranties and reliabilities are complementary, and we explore the impact of different market factors on the optimal warranty and reliability. Finally, we show that optimal warranties are longer and products more reliable when consumers are risk averse.

4.
We consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with an uncountable state space, namely the space of probability distributions on the original core state space. By developing a suitable theoretical framework, it is shown that some characteristics induced in the original problem due to the countability of the spaces involved are reflected onto the equivalent problem. Sufficient conditions are then derived for solutions to the average cost optimality equation to exist. We illustrate these results in the context of machine replacement problems. Structural properties for average cost optimal policies are obtained for a two state replacement problem; these are similar to results available for discount optimal policies. The set of assumptions used compares favorably to others currently available. This research was supported in part by the Advanced Technology Program of the State of Texas, in part by the Air Force Office of Scientific Research under Grant AFOSR-86-0029, in part by the National Science Foundation under Grant ECS-8617860, and in part by the Air Force Office of Scientific Research (AFSC) under Contract F49620-89-C-0044.
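The "equivalent completely observed problem" in the abstract above takes a probability distribution over core states (a belief) as its state. A minimal Python sketch of the standard Bayes update of such a belief follows. The two-state machine model, the names `T`, `O` and `belief_update`, and all probabilities are hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Sequence

# Hypothetical two-state machine: state 0 = good, state 1 = bad.
# T[a][s][s2] = P(next state s2 | state s, action a); action 0 = operate, 1 = replace.
T = [
    [[0.9, 0.1], [0.0, 1.0]],  # operate: a good machine may fail, a bad one stays bad
    [[1.0, 0.0], [1.0, 0.0]],  # replace: the machine is good afterwards
]
# O[a][s2][o] = P(observation o | next state s2, action a); o = 0 fine item, o = 1 defective item.
O = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.8, 0.2], [0.3, 0.7]],
]


def belief_update(
    belief: Sequence[float],
    transition: Sequence[Sequence[Sequence[float]]],
    observation: Sequence[Sequence[Sequence[float]]],
    action: int,
    obs: int,
) -> List[float]:
    """Return the posterior belief after taking `action` and observing `obs`."""
    n = len(belief)
    # Prediction step: push the current belief through the transition kernel.
    predicted = [
        sum(belief[s] * transition[action][s][s2] for s in range(n))
        for s2 in range(n)
    ]
    # Correction step: weight each predicted state by the observation likelihood.
    unnormalized = [observation[action][s2][obs] * predicted[s2] for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    # Start certain that the machine is good, operate it, then see a defective item.
    print(belief_update([1.0, 0.0], T, O, action=0, obs=1))
```

With these assumed numbers, one defective item moves the belief from (1, 0) to roughly (0.72, 0.28). The belief takes values in a continuum even though the core state space has only two points, which is why the equivalent problem has an uncountable state space.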

5.
Most industrial products and processes are characterized by several, typically correlated measurable variables, which jointly describe the product or process quality. Various control charts such as Hotelling's T², EWMA and CUSUM charts have been developed for multivariate quality control, where the values of the chart parameters, namely the sample size, sampling interval and the control limits, are determined to satisfy given economic and/or statistical requirements. It is well known that this traditional non-Bayesian approach to control chart design is not optimal, but very few results regarding the form of the optimal Bayesian control policy have appeared in the literature, all limited to univariate chart design. In this paper, we consider a multivariate Bayesian process mean control problem for a finite production run under the assumption that the observations are values of independent, normally distributed vectors of random variables. The problem is formulated in the POMDP (partially observable Markov decision process) framework and the objective is to determine a control policy minimizing the total expected cost. It is proved that under standard operating and cost assumptions the control limit policy is optimal. Cost comparisons with the benchmark chi-squared chart and the MEWMA chart show that the Bayesian chart is highly cost effective; the savings are larger for smaller values of the critical Mahalanobis distance between the in-control and out-of-control process mean.
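For orientation only, here is a minimal Python sketch of a chi-squared chart of the kind the abstract above names as a benchmark (it is not the Bayesian chart of the paper). It assumes a bivariate process with known in-control mean and covariance and a single observation per sample. The function names and the false-alarm rate `alpha` are hypothetical choices. The control limit uses the fact that a chi-squared variable with 2 degrees of freedom has upper-tail quantile -2 ln(alpha).

```python
import math
from typing import Sequence


def mahalanobis_sq(
    x: Sequence[float],
    mean: Sequence[float],
    cov: Sequence[Sequence[float]],
) -> float:
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean) for 2 variables."""
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form with the closed-form 2x2 inverse (1/det) * [[d, -b], [-c, a]].
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def chi2_limit_2df(alpha: float) -> float:
    """Upper control limit: the (1 - alpha) quantile of chi-squared with 2 d.o.f."""
    return -2.0 * math.log(alpha)


def signals(
    x: Sequence[float],
    mean: Sequence[float],
    cov: Sequence[Sequence[float]],
    alpha: float = 0.005,
) -> bool:
    """True if the observation plots above the control limit (out-of-control signal)."""
    return mahalanobis_sq(x, mean, cov) > chi2_limit_2df(alpha)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(signals([3.0, 4.0], [0.0, 0.0], identity))  # distance 25 exceeds the limit
    print(signals([1.0, 1.0], [0.0, 0.0], identity))  # distance 2 stays below it
```

A chart of this kind signals on each observation separately, whereas the Bayesian policy in the abstract bases its decision on the posterior information accumulated over the run.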

6.
7.
The optimal-stopping problem in a partially observable Markov chain is considered, and this is formulated as a Markov decision process. We treat a multiple stopping problem in this paper. Unlike the classical stopping problem, the current state of the chain is not known directly. Information about the current state is always available from an information process. Several properties about the value and the optimal policy are given. For example, if we add another stop action to the k-stop problem, the increment of the value is decreasing in k. The author wishes to thank Professor M. Sakaguchi of Osaka University for his encouragement and guidance. He also thanks the referees for their careful readings and helpful comments.

8.
For statistical decision problems, there are two well-known methods of randomization: on the one hand, randomization by means of mixtures of nonrandomized decision functions (randomized decision rules) in the game “statistician against nature,” on the other hand, randomization by means of randomized decision functions. In this paper, we consider the problem of risk-equivalence of these two procedures, i.e., imposing fairly general conditions on a nonsequential decision problem, it is shown that to each randomized decision rule, there is a randomized decision function with uniformly the same risk, and vice versa. The crucial argument is based on rewriting risk-equivalence in terms of Choquet's integral representation theorem. It is shown, in addition, that for certain special cases that do not fulfill the assumptions of the Main Theorem, risk-equivalence holds at least partially.

9.
The concept of statistical decision theory concerning sequential observations is generalized to decision problems that are based upon a continuous stochastic process.

In this model, decision functions consisting of a stopping time and a terminal decision rule are introduced. A method of discretization shows the connections between the discrete sequential and the continuous model. Concerning Bayes problems, we find that under certain assumptions the decision problem can be viewed as an optimal stopping problem with continuous time parameter.

10.
In this paper, we study a partially observed recursive optimization problem, which is time inconsistent in the sense that it does not admit the Bellman optimality principle. To obtain the desired results, we establish the Kalman–Bucy filtering equations for a family of parameterized forward and backward stochastic differential equations, which is a Hamiltonian system derived from the general maximum principle for the fully observed time-inconsistency recursive optimization problem. By means of the backward separation technique, the equilibrium control for the partially observed time-inconsistency recursive optimization problem is obtained, which is a feedback of the state filtering estimation. To illustrate the applications of theoretical results, an insurance premium policy problem under partial information is presented, and the observable equilibrium policy is derived explicitly.

11.
In this article, we consider a filtering problem for forward-backward stochastic systems that are driven by Brownian motions and Poisson processes. This kind of filtering problem arises from the study of partially observable stochastic linear-quadratic control problems. Combining forward-backward stochastic differential equation theory with certain classical filtering techniques, the desired filtering equation is established. To illustrate the filtering theory, the theoretical result is applied to solve a partially observable linear-quadratic control problem, where an explicit observable optimal control is determined by the optimal filtering estimation.

12.
In this paper, partially observable Markov decision processes (POMDPs) with discrete state and action space under the average reward criterion are considered from a recently developed sensitivity point of view. By analyzing the average-reward performance difference formula, we propose a policy iteration algorithm with step sizes to obtain an optimal or locally optimal memoryless policy. This algorithm improves the policy along the same direction as policy iteration does, and suitable step sizes guarantee the convergence of the algorithm. Moreover, the algorithm can be used in Markov decision processes (MDPs) with correlated actions. Two numerical examples are provided to illustrate the applicability of the algorithm.

13.
In this paper, a continuous-time discounted dynamic programming problem in a Markov decision model is investigated. In many cases it is difficult to search directly for an optimal solution to such a programming problem. We introduce a Lagrangian-type programming problem associated with the original programming problem and show that, under some assumptions, a weak optimal solution exists for the Lagrangian problem. Moreover, we consider the original programming problem within a perturbed programming problem and develop the Lagrangian duality.

14.
In this paper, we consider an availability maximization problem for a partially observable system subject to random failure. System deterioration is described by a hidden, continuous-time homogeneous Markov process. While the system is operational, multivariate observations that are stochastically related to the system state are sampled through condition monitoring at discrete time points. The objective is to design an optimal multivariate Bayesian control chart that maximizes the long-run expected average availability per unit time. We develop an efficient computational algorithm in the semi-Markov decision process (SMDP) framework and show that the availability maximization problem is equivalent to solving a parameterized system of linear equations. A numerical example is presented to illustrate the effectiveness of our approach, and a comparison with the traditional age-based replacement policy is also provided.

15.
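Under limited attention, the choice behavior of real-life decision makers often exhibits a "satisficing heuristic" character. Based on the incompleteness of an individual decision maker's preferences, and using related concepts such as sequences of alternative sets and consideration sets, this paper explores the modeling of satisficing heuristic decision rules. It establishes the existence of a class of satisficing decision functions when the sequence of alternative sets is observable and when it is only partially observable, together with their rationality properties under the relevant rationality conditions, and it validates the resulting satisficing decision model with a simulation experiment based on a customer purchasing case. The simulation results show that, under time pressure and incomplete information, a decision maker can ensure that a satisfactory alternative is chosen with maximal probability by eliminating some of the alternatives. The results may offer theoretical reference and guidance for real-life decision makers choosing under time pressure or missing information, and may also serve as a theoretical basis for research on satisficing decisions.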

16.
In cybersecurity, incomplete inspection, resulting mainly from computers being turned off during the scan, leads to a challenge for scheduling maintenance actions. This article proposes the application of partially observable decision processes to derive cost-effective cyber maintenance actions that minimize total costs. We consider several types of hosts having vulnerabilities at various levels of severity. The maintenance cost structure in our proposed model consists of the direct costs of maintenance actions in addition to potential incident costs associated with different security states. To assess the benefits of optimal policies obtained from partially observable Markov decision processes, we use real-world data from a major university. In simulation comparisons with alternative policies, the optimal control policies can significantly reduce expected maintenance expenditures per host and relatively quickly mitigate the most important vulnerabilities.

17.
We study the problem of guaranteed positional guidance of a linear partially observable control system with distributed parameters to a convex target set at a given time. The problem is considered under incomplete information. More precisely, we assume that the system is subjected to an unknown disturbance; in addition, the initial state is assumed to be unknown as well. Further, the sets of admissible disturbances and the set of admissible initial states, which is assumed to be finite, are known. An algorithm for solving the problem is suggested.

18.
In this article, by employing the Dhage iterative method embodied in the current hybrid fixed point theorem (HFPT) of Dhage, we derive an algorithm for numerical solutions via the construction of a sequence of successive approximations for a fractional order boundary value problem (FBVP) with finite delay. By using this technique, we obtain existence as well as approximation of solutions under weaker partial Lipschitz and partial compactness type conditions in a partially ordered Banach space. Additionally, we prove an existence and uniqueness theorem under a weaker partial nonlinear Lipschitz condition. The assumptions and main outcomes are also illustrated by two examples.

19.
The blow-up in finite time of solutions to the initial-boundary value problem associated with the multi-dimensional quantum hydrodynamic model in a bounded domain is proved. The model consists of a conservation of mass equation and a momentum balance equation, equivalent to the compressible Euler equations corrected by a third-order dispersion term in the momentum balance. The proof is based on a priori estimates for the energy functional for a new observable constructed with an auxiliary function, and it is shown that, under suitable boundary conditions and assumptions on the initial data, the solution blows up after a finite time. I.M. Gamba is supported by NSF-DMS0507038. M.P. Gualdani acknowledges partial support from the Deutsche Forschungsgemeinschaft, grants JU359/5 and was partially supported under the Feodor Lynen Research fellowship. P. Zhang is partially supported by the NSF of China under Grant 10525101 and 10421101, and the innovation grant from the Chinese Academy of Sciences. Part of the work was done when P. Zhang visited the Department of Mathematics of the University of Texas at Austin; the author would like to thank the department for its hospitality. Support from the Institute for Computational Engineering and Sciences at the University of Texas at Austin is also gratefully acknowledged.

20.
First, we give a partial solution to the isomorphism problem for uniserial modules of finite length with the help of the morphisms between these modules. Later, under suitable assumptions on the lattice of the submodules, we give a method to partially solve the isomorphism problem for uniserial modules over an arbitrary ring. Particular attention is given to the natural class of uniserial modules defined over algebras given by quivers.
