Similar documents
A total of 20 similar documents were retrieved.
1.
This paper studies the reliability and the diagnosis policy of a repairable system in which particular states require special repair. The system is assumed to have three operating states: normal, abnormal, and failed; some abnormal and failed states require special repair, and the current state of the system can only be identified through diagnosis. Each time the system resumes normal operation, a diagnosis is performed after every random time interval T until the system fails or is diagnosed as abnormal. Using probabilistic analysis and vector Markov process methods, the reliability indices of the system are derived and the optimal diagnosis policy is studied.
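
For illustration, a minimal Monte Carlo sketch of the kind of diagnosis policy described: the system progresses from normal to abnormal to failed, a diagnosis is run after each random interval T, and an operating cycle ends either at failure or at the first diagnosis that detects the abnormal state. The exponential distributions and all parameter values are assumptions made for this sketch, not the paper's model.

```python
import random

# Illustrative parameters (assumptions, not taken from the paper):
# mean time until the system becomes abnormal, mean abnormal-to-failure
# time, and the mean of the random inter-diagnosis interval T.
MEAN_TO_ABNORMAL = 100.0
MEAN_ABNORMAL_TO_FAILURE = 30.0
MEAN_DIAGNOSIS_INTERVAL = 20.0

def simulate_cycle(rng):
    """Simulate one operating cycle: from a fresh start until the system
    either fails or is diagnosed as abnormal. Returns (cycle length,
    True if the cycle ended by diagnosis rather than by failure)."""
    onset = rng.expovariate(1.0 / MEAN_TO_ABNORMAL)            # abnormal onset
    failure = onset + rng.expovariate(1.0 / MEAN_ABNORMAL_TO_FAILURE)
    t = 0.0
    while True:
        t += rng.expovariate(1.0 / MEAN_DIAGNOSIS_INTERVAL)    # next diagnosis
        if t >= failure:
            return failure, False        # system failed before detection
        if t >= onset:
            return t, True               # diagnosis caught the abnormal state

rng = random.Random(0)
results = [simulate_cycle(rng) for _ in range(100_000)]
mean_up = sum(r[0] for r in results) / len(results)
p_detect = sum(r[1] for r in results) / len(results)
print(f"mean operating time per cycle: {mean_up:.1f}")
print(f"fraction of cycles ended by diagnosis (not failure): {p_detect:.3f}")
```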

2.
We analyze mean time to failure and availability of semi-Markov missions that consist of phases with random sequence and durations. It is assumed that the system is a complex one with nonidentical components whose failure properties depend on the mission process. The stochastic structure of the mission is described by a Markov renewal process. We characterize mean time to failure and system availability under the maximal repair policy, where the whole system is replaced by a brand-new one after successfully completing a phase, before the next phase starts. Special cases involving Markovian missions are also considered to obtain explicit formulas.
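
A rough simulation sketch of the maximal-repair policy described above: mission phases evolve as a Markov renewal process, the (series) system fails at a phase-dependent rate, and the whole system is renewed at every successful phase completion. The phase structure, durations, and failure rates below are illustrative assumptions, not data from the paper.

```python
import random

# Illustrative mission/failure data (assumptions, not from the paper).
NEXT_PHASE = {                       # embedded Markov chain of the mission
    "climb":   [("cruise", 1.0)],
    "cruise":  [("cruise", 0.3), ("descent", 0.7)],
    "descent": [("climb", 1.0)],     # a new sortie starts after descent
}
PHASE_DURATION = {"climb": (0.2, 0.5), "cruise": (1.0, 3.0), "descent": (0.2, 0.5)}
FAILURE_RATE = {"climb": 0.10, "cruise": 0.02, "descent": 0.08}  # series system

def sample_next_phase(phase, rng):
    r, acc = rng.random(), 0.0
    for nxt, p in NEXT_PHASE[phase]:
        acc += p
        if r <= acc:
            return nxt
    return NEXT_PHASE[phase][-1][0]

def time_to_failure(rng):
    """One realization of the system lifetime under maximal repair:
    the system is as good as new at every successful phase completion."""
    t, phase = 0.0, "climb"
    while True:
        duration = rng.uniform(*PHASE_DURATION[phase])   # random phase length
        ttf = rng.expovariate(FAILURE_RATE[phase])       # fresh system each phase
        if ttf < duration:
            return t + ttf                               # failed during the phase
        t += duration
        phase = sample_next_phase(phase, rng)            # maximal repair, move on

rng = random.Random(1)
samples = [time_to_failure(rng) for _ in range(50_000)]
print(f"estimated MTTF: {sum(samples) / len(samples):.2f}")
```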

3.
This work focuses on optimal controls for hybrid systems of renewable resources in random environments. We propose a new formulation to treat the optimal exploitation with harvesting and renewing. The random environments are modeled by a Markov chain, which is hidden and can be observed only in a Gaussian white noise. We use the Wonham filter to estimate the state of the Markov chain from the observable process. Then we formulate a harvesting–renewing model under partial observation. The Markov chain approximation method is used to find a numerical approximation of the value function and optimal policies. Our work takes into account natural aspects of the resource exploitation in practice: interacting resources, switching environment, renewing and partial observation. Numerical examples are provided to demonstrate the results and explore new phenomena arising from new features in the proposed model.
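
A minimal sketch of the filtering step mentioned above: a discrete-time (Euler) approximation of the Wonham filter for a hidden two-regime Markov chain observed in Gaussian white noise. The generator Q, drift levels g, noise intensity sigma, and step size dt are illustrative assumptions; the harvesting-renewing control layer is not included here.

```python
import numpy as np

def wonham_step(p, dY, Q, g, sigma, dt):
    """One Euler step of a discretized Wonham filter.

    p     : current posterior over the hidden regimes (sums to one)
    dY    : observed increment of the observation process over dt
    Q     : generator matrix of the hidden Markov chain (rows sum to zero)
    g     : observation drift level in each regime
    sigma : observation noise intensity
    """
    p_pred = p + dt * (Q.T @ p)                              # prediction step
    lik = np.exp(-(dY - g * dt) ** 2 / (2 * sigma**2 * dt))  # Gaussian likelihood
    p_new = p_pred * lik                                     # correction step
    return p_new / p_new.sum()

# Illustrative two-regime example (all parameters are assumptions).
rng = np.random.default_rng(0)
Q = np.array([[-0.5, 0.5],
              [0.3, -0.3]])          # switching rates of the hidden environment
g = np.array([1.0, -1.0])            # drift of the observation in each regime
sigma, dt = 0.5, 0.01

p = np.array([0.5, 0.5])              # prior over the hidden regime
regime = 0
for _ in range(2000):
    if rng.random() < -Q[regime, regime] * dt:               # simulate the chain
        regime = 1 - regime
    dY = g[regime] * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    p = wonham_step(p, dY, Q, g, sigma, dt)

print("true regime:", regime, " filtered posterior:", np.round(p, 3))
```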

4.
We consider a service system with a single server, a finite waiting room and two classes of customers with deterministic service time. Primary jobs arrive at random and are admitted whenever there is room in the system. At the beginning of each period, secondary jobs can be admitted from an infinite pool. A revenue is earned upon admission of each job, with the primary jobs bringing a higher contribution than the secondary jobs, the objective being to maximize the total discounted revenue over an infinite horizon. We model the system as a discrete time Markov decision process and show that a monotone admission policy is optimal when the number of primary arrivals has a fixed distribution. Moreover, when the primary arrival distribution varies with time according to a finite state Markov chain, we show that the optimal policy is conditionally monotone and that the numerical computation of an optimal policy, in this case, is substantially more difficult than in the case of stationary arrivals. This research was supported in part by the National Science Foundation, under grant ECS-8803061, while the author was at the University of Arizona.

5.
We consider a Markov decision process with a Borel state space, bounded rewards, and a bounded transition density satisfying a simultaneous Doeblin-Doob condition. An asymptotic result for the discounted value function, related to the existence of stationary strong 0-discount optimal policies, is extended from the case of finite action sets to the case of compact action sets and rewards and transition densities that are continuous in the action. Supported by NSF grant DMS-9404177.

6.
This paper considers a periodic-review shuttle service system with random customer demands and finite reposition capacity. The objective is to find the optimal stationary policy of empty container reposition by minimizing the sum of container leasing cost, inventory cost and reposition cost. Using a Markov decision process approach, the structures of the optimal stationary policies for both expected discounted cost and long-run average cost are completely characterized. Monotonic and asymptotic behaviours of the optimal policy are established. By taking advantage of the special structure of the optimal policy, the stationary distribution of the system states is obtained, which is then used to compute interesting steady-state performance measures and implement the optimal policy. Numerical examples are given to demonstrate the results.

7.
We consider a discrete time Markov decision process (MDP) with a finite state space, a finite action space, and two kinds of immediate rewards. The problem is to maximize the time average reward generated by one reward stream, subject to the other reward not being smaller than a prescribed value. An MDP with a reward constraint can be solved by linear programming in the range of mixed policies. On the other hand, when we restrict ourselves to pure policies, the problem is a combinatorial problem, for which a solution has not been discovered. In this paper, we propose an approach based on Genetic Algorithms (GAs) in order to obtain an effective search process and to find a near-optimal, possibly optimal, pure stationary policy. A numerical example is given to examine the efficiency of the proposed approach.
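
The linear-programming route for mixed policies mentioned above can be sketched with occupation-measure variables. The sketch below assumes a unichain average-reward MDP; the transition matrix, rewards and constraint level are made-up toy data, not the paper's example.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny constrained average-reward MDP (all numbers are illustrative).
P = np.array([[[0.7, 0.2, 0.1], [0.2, 0.6, 0.2]],
              [[0.3, 0.5, 0.2], [0.1, 0.2, 0.7]],
              [[0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]])   # P[s, a, s']
r1 = np.array([[1.0, 0.2], [0.5, 0.8], [0.1, 0.9]])  # reward stream to maximize
r2 = np.array([[0.1, 0.9], [0.8, 0.2], [0.7, 0.4]])  # constrained reward stream
c_min = 0.4                                          # required average of r2
S, A = r1.shape
n = S * A                                            # one variable x[s, a] per pair

# Stationarity of the occupation measure:
#   sum_a x[s, a]  =  sum_{s', a'} P[s', a', s] x[s', a']   for every state s.
A_eq = np.zeros((S + 1, n))
for s in range(S):
    for sp in range(S):
        for a in range(A):
            A_eq[s, sp * A + a] -= P[sp, a, s]
    for a in range(A):
        A_eq[s, s * A + a] += 1.0
A_eq[S, :] = 1.0                                     # x is a probability measure
b_eq = np.zeros(S + 1)
b_eq[S] = 1.0

# Reward constraint  r2 . x >= c_min  written as  -r2 . x <= -c_min.
res = linprog(c=-r1.reshape(n),
              A_ub=-r2.reshape(1, n), b_ub=[-c_min],
              A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * n)

x = res.x.reshape(S, A)
policy = x / x.sum(axis=1, keepdims=True)            # possibly randomized policy
print("optimal average r1:", -res.fun)
print("stationary (mixed) policy:\n", policy)
```

The balance constraints make x a stationary state-action frequency, and a (generally randomized) stationary policy is read off by normalizing x within each state; a pure policy, as the abstract notes, cannot be guaranteed by this relaxation.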

8.
9.
We treat an inventory control problem in a facility that provides a single type of service for customers. Items used in service are supplied by an outside supplier. To incorporate lost sales due to service delay into the inventory control, we model a queueing system with finite waiting room and a non-instantaneous replenishment process and examine the impact of the finite buffer on replenishment policies. Employing Markov decision process theory, we characterize the optimal replenishment policy as a monotonic threshold function of the reorder point under the discounted cost criterion. We present a simple procedure that jointly finds the optimal buffer size and order quantity.

10.
We consider a finite state Markov process θ, feeding the coefficients of a linear Itô-equation with state ξ. The θ-process is observed in white noise, and it is shown that the optimal nonlinear filter for ξ is of finite dimension. We also derive finite dimensional equations for optimal prediction and smoothing.

11.
Decision makers often face the need to guarantee performance with some sufficiently high probability. Such problems can be modelled using a discrete time Markov decision process (MDP) with a probability criterion for first achieving a target value. The objective is to find a policy that maximizes the probability of the total discounted reward exceeding a target value in the preceding stages. We show that our formulation cannot be described by former models with standard criteria. We provide the properties of the objective functions, optimal value functions and optimal policies. An algorithm for computing the optimal policies in the finite horizon case is given. In this stochastic stopping model, we prove that there exists an optimal deterministic and stationary policy and that the optimality equation has a unique solution. Using perturbation analysis, we approximate general models and prove the existence of an ε-optimal policy for a finite state space. We give an example for the reliability of a satellite system.
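
In symbols, a generic statement of the target-probability criterion described above, in standard MDP notation with discount factor β and target value λ; this is an illustrative infinite-horizon form, not a formula quoted from the paper.

```latex
% Probability-criterion objective: maximize, over policies \pi, the
% probability that the total discounted reward reaches the target \lambda.
V^{*}(x,\lambda) \;=\; \sup_{\pi}\;
  \mathbb{P}^{\pi}_{x}\!\left( \sum_{t=0}^{\infty} \beta^{t}\, r(X_t, A_t) \;\ge\; \lambda \right),
\qquad 0 < \beta < 1 .
```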

12.
13.
This paper deals with a continuous-time Markov decision process in Borel state and action spaces and with unbounded transition rates. Under history-dependent policies, the controlled process may not be Markov. The main contribution is that for such non-Markov processes we establish the Dynkin formula, which plays an important role in establishing optimality results for continuous-time Markov decision processes. We further illustrate this by showing, for a discounted continuous-time Markov decision process, the existence of a deterministic stationary optimal policy (out of the class of history-dependent policies) and characterizing the value function through the Bellman equation.
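
For orientation, the two objects named above have the following generic textbook forms, written for an extended generator, discount rate alpha, reward rate r(x, a) and conservative transition kernel q(dy | x, a); the paper's precise conditions for their validity are not reproduced here.

```latex
% Dynkin formula for a (sufficiently regular) function f:
\mathbb{E}_{x}\!\left[f(X_t)\right] - f(x)
   \;=\; \mathbb{E}_{x}\!\left[\int_{0}^{t} (\mathcal{A} f)(X_s)\, ds\right].

% Bellman (optimality) equation of the discounted continuous-time MDP:
\alpha\, V(x) \;=\; \sup_{a \in A(x)}
   \left\{\, r(x,a) + \int_{X} V(y)\, q(dy \mid x, a) \right\}.
```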

14.
We study a unichain Markov decision process, i.e., a controlled Markov process whose state process under a stationary policy is an ergodic Markov chain. Here the state and action spaces are assumed to be either finite or countable. When the state process is uniformly ergodic and the immediate cost is bounded, a policy that minimizes the long-term expected average cost also has an nth-stage sample-path cost that, with probability one, is asymptotically less than the nth-stage sample-path cost under any other non-optimal stationary policy with a larger expected average cost. This is a strengthening, in the Markov model case, of the a.s. asymptotically optimal property frequently discussed in the literature.

15.
We consider a discrete-time Markov decision process with a partially ordered state space and two feasible control actions in each state. Our goal is to find general conditions, which are satisfied in a broad class of applications to control of queues, under which an optimal control policy is monotonic. An advantage of our approach is that it easily extends to problems with both information and action delays, which are common in applications to high-speed communication networks, among others. The transition probabilities are stochastically monotone and the one-stage reward submodular. We further assume that transitions from different states are coupled, in the sense that the state after a transition is distributed as a deterministic function of the current state and two random variables, one of which is controllable and the other uncontrollable. Finally, we make a monotonicity assumption about the sample-path effect of a pairwise switch of the actions in consecutive stages. Using induction on the horizon length, we demonstrate that optimal policies for the finite- and infinite-horizon discounted problems are monotonic. We apply these results to a single queueing facility with control of arrivals and/or services, under very general conditions. In this case, our results imply that an optimal control policy has threshold form. Finally, we show how monotonicity of an optimal policy extends in a natural way to problems with information and/or action delay, including delays of more than one time unit. Specifically, we show that, if a problem without delay satisfies our sufficient conditions for monotonicity of an optimal policy, then the same problem with information and/or action delay also has monotonic (e.g., threshold) optimal policies.
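
A minimal numerical illustration of the threshold structure in the queueing application: value iteration for a single discrete-time queue with admission control, where the computed optimal policy admits arrivals exactly below some queue-length threshold. The arrival and service probabilities, reward, holding cost and buffer size are assumptions for this toy model, not the paper's general setting (which allows much weaker conditions and delays).

```python
import numpy as np

# Illustrative admission-control queue (parameters are assumptions): one
# arrival per slot with probability p, one service completion with
# probability q, admission reward R, per-customer holding cost h.
B = 20            # buffer size
p, q = 0.4, 0.5   # arrival / service-completion probabilities
R, h = 5.0, 0.5   # admission reward and holding cost per slot
beta = 0.95       # discount factor

def after_service(V, y):
    """Expected value once a possible service completion acts on level y."""
    return V[0] if y == 0 else q * V[y - 1] + (1 - q) * V[y]

def action_values(V, x):
    """Values of rejecting vs. admitting new arrivals in state x."""
    v_reject = after_service(V, x)
    if x < B:
        v_admit = p * (R + after_service(V, x + 1)) + (1 - p) * after_service(V, x)
    else:
        v_admit = v_reject                  # buffer full: admission impossible
    return v_reject, v_admit

V = np.zeros(B + 1)
for _ in range(5000):                       # value iteration
    V_new = np.array([-h * x + beta * max(action_values(V, x))
                      for x in range(B + 1)])
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# Monotone structure in practice: admit only below some threshold queue length.
policy = [int(action_values(V, x)[1] > action_values(V, x)[0]) for x in range(B + 1)]
print("admit (1) / reject (0) by queue length:", policy)
```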

16.
We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) convergence of MDPs with a truncated state space to the problem with infinite state space, (3) convergence of MDPs as the discount factor goes to a limit. In all these cases we establish the convergence of optimal values and policies. Moreover, based on the optimal policy for the limiting problem, we construct policies which are almost optimal for the other (approximating) problems. Based on the convergence of MDPs with a truncated state space to the problem with infinite state space, we show that an optimal stationary policy exists such that the number of randomisations it uses is less than or equal to the number of constraints plus one. We finally apply the results to a dynamic scheduling problem. This work was partially supported by the Chateaubriand fellowship from the French embassy in Israel and by the European Grant BRA-QMIPS of CEC DG XIII.

17.
Zero entropy processes are known to be deterministic—the past determines the present. We show that each is isomorphic, as a system, to a finitarily deterministic one, i.e., one in which to determine the present from the past it suffices to scan a finite (of random length) portion of the past. In fact we show more: the finitary scanning can be done even if the scanner is noisy and passes only a small fraction of the readings, provided the noise is independent of our system. The main application we present here is that any zero entropy system can be extended to a random Markov process (namely one in which the conditional distribution of the present given the past is a mixture of finite state Markov chains). This allows one to study zero entropy transformations using a procedure completely different from the usual cutting and stacking.

18.
We consider a repairable system with a finite state space, evolving in time according to a semi-Markov process. The system is stopped at random times, for a random duration, so that it can be preventively maintained. Our aim is to find the preventive maintenance policy that optimizes the stationary availability, whenever it exists. The computation of the stationary availability is based on the fact that the maintained system evolves according to a semi-regenerative process. As for the optimization, we observe in numerical examples that it is possible to limit the study to maintenance actions that begin at deterministic times. We demonstrate this result in a particular case and we study the deterministic maintenance policies in that case. In particular, we show that, if the initial system has an increasing failure rate, the maintenance actions improve the stationary availability if and only if they are not too long on average, compared to the repairs (a bound for the mean duration of the maintenance actions is provided). On the contrary, if the initial system has a decreasing failure rate, the maintenance policy lowers the stationary availability. A few other cases are studied. Copyright © 2000 John Wiley & Sons, Ltd.
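
A generic renewal-reward style expression makes the trade-off described above concrete; this is the standard stationary-availability ratio over a regeneration cycle, written in illustrative notation (U for up time, D for downtime), not the paper's specific bound.

```latex
% Stationary availability over a (semi-)regeneration cycle:
A \;=\; \frac{\mathbb{E}[\text{up time per cycle}]}{\mathbb{E}[\text{cycle length}]}
  \;=\; \frac{\mathbb{E}[U]}{\mathbb{E}[U] + \mathbb{E}[D_{\mathrm{repair}}] + \mathbb{E}[D_{\mathrm{maint}}]} .

% Maintenance helps precisely when it raises this ratio relative to the
% unmaintained system, i.e. when the extra downtime E[D_maint] is outweighed
% by the gain in up time and the repair time it avoids.
```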

19.
Algorithms are described for determining optimal policies for finite state, finite action, infinite discrete time horizon Markov decision processes. Both value-improvement and policy-improvement techniques are used in the algorithms. Computing procedures are also described. The algorithms are appropriate for processes that are either finite or infinite, deterministic or stochastic, discounted or undiscounted, in any meaningful combination of these features. Computing procedures are described in terms of initial data processing, bound improvements, process reduction, and testing and solution. Application of the methodology is illustrated with an example involving natural resource management. Management implications of certain hypothesized relationships between mallard survival and harvest rates are addressed by applying the optimality procedures to mallard population models.
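
As a minimal sketch of the value-improvement/policy-improvement combination mentioned above, the following generic policy-iteration routine solves a tiny finite, discounted MDP; the transition and reward data are made-up, and the paper's own procedures (bound improvements, process reduction, undiscounted cases) are not reproduced.

```python
import numpy as np

def policy_iteration(P, r, beta):
    """Generic policy iteration for a finite discounted MDP.

    P[s, a, s'] : transition probabilities
    r[s, a]     : one-step rewards
    beta        : discount factor in (0, 1)
    """
    S, A = r.shape
    policy = np.zeros(S, dtype=int)
    while True:
        # Policy evaluation: solve (I - beta * P_pi) V = r_pi exactly.
        P_pi = P[np.arange(S), policy]            # S x S transition matrix
        r_pi = r[np.arange(S), policy]            # S-vector of rewards
        V = np.linalg.solve(np.eye(S) - beta * P_pi, r_pi)
        # Policy improvement: act greedily with respect to V.
        Q = r + beta * P @ V                      # S x A action values
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, V
        policy = new_policy

# Tiny illustrative example (all numbers are assumptions).
P = np.array([[[0.9, 0.1], [0.4, 0.6]],
              [[0.2, 0.8], [0.7, 0.3]]])          # P[s, a, s']
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])                        # r[s, a]
policy, V = policy_iteration(P, r, beta=0.9)
print("optimal actions by state:", policy, "\nvalue function:", V)
```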

20.
We consider a Markov decision process with an uncountable state space for which the vector performance functional has the form of expected total rewards. Under the single condition that the initial distribution and transition probabilities are nonatomic, we prove that the performance space coincides with that generated by nonrandomized Markov policies. We also provide conditions for the existence of optimal policies when the goal is to maximize one component of the performance vector subject to inequality constraints on other components. We illustrate our results with examples of production and financial problems.
