首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
《Optimization》2012,61(4):773-800
Abstract

In this paper we study the risk-sensitive average cost criterion for continuous-time Markov decision processes in the class of all randomized Markov policies. The state space is a denumerable set, and the cost and transition rates are allowed to be unbounded. Under the suitable conditions, we establish the optimality equation of the auxiliary risk-sensitive first passage optimization problem and obtain the properties of the corresponding optimal value function. Then by a technique of constructing the appropriate approximating sequences of the cost and transition rates and employing the results on the auxiliary optimization problem, we show the existence of a solution to the risk-sensitive average optimality inequality and develop a new approach called the risk-sensitive average optimality inequality approach to prove the existence of an optimal deterministic stationary policy. Furthermore, we give some sufficient conditions for the verification of the simultaneous Doeblin condition, use a controlled birth and death system to illustrate our conditions and provide an example for which the risk-sensitive average optimality strict inequality occurs.  相似文献   

2.
3.
We consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with anuncountable state space, namely the space of probability distributions on the original core state space. By developing a suitable theoretical framework, it is shown that some characteristics induced in the original problem due to the countability of the spaces involved are reflected onto the equivalent problem. Sufficient conditions are then derived for solutions to the average cost optimality equation to exist. We illustrate these results in the context of machine replacement problems. Structural properties for average cost optimal policies are obtained for a two state replacement problem; these are similar to results available for discount optimal policies. The set of assumptions used compares favorably to others currently available.This research was supported in part by the Advanced Technology Program of the State of Texas, in part by the Air Force Office of Scientific Research under Grant AFOSR-86-0029, in part by the National Science Foundation under Grant ECS-8617860, and in part by the Air Force Office of Scientific Research (AFSC) under Contract F49620-89-C-0044.  相似文献   

4.
We revisit the problem of job assignment to multiple heterogeneous servers in parallel. The system under consideration, however, has a few unique features. Specifically, repair jobs arrive to the queueing system in batches according to a Poisson process. In addition, servers are heterogeneous and the service time distributions of the individual servers are general. The objective is to optimally assign each job within a batch arrival to minimize the long-run average number of jobs in the entire system. We focus on the class of static assignment policies where jobs are routed to servers upon arrival according to pre-determined probabilities. We solve the model analytically and derive the structural properties of the optimal static assignment. We show that when the traffic is below a certain threshold, it is better to not assign any jobs to slower servers. As traffic increases (either due to an increase in job arrival rate or batch size), more slower servers will be utilized. We give an explicit formula for computing the threshold. Finally we compare and evaluate the performance of the static assignment policy to two dynamic policies, specifically the shortest expected completion policy and the shortest queue policy.  相似文献   

5.
In this paper we study zero-sum stochastic games. The optimality criterion is the long-run expected average criterion, and the payoff function may have neither upper nor lower bounds. We give a new set of conditions for the existence of a value and a pair of optimal stationary strategies. Our conditions are slightly weaker than those in the previous literature, and some new sufficient conditions for the existence of a pair of optimal stationary strategies are imposed on the primitive data of the model. Our results are illustrated with a queueing system, for which our conditions are satisfied but some of the conditions in some previous literatures fail to hold.  相似文献   

6.
We present an average case analysis of the minimum spanning tree heuristic for the power assignment problem. The worst‐case approximation ratio of this heuristic is 2. We show that in Euclidean d‐dimensional space, when the vertex set consists of a set of i.i.d. uniform random independent, identically distributed random variables in [0,1]d, and the distance power gradient equals the dimension d, the minimum spanning tree‐based power assignment converges completely to a constant depending only on d.  相似文献   

7.
We treat non-cooperative stochastic games with countable state space and with finitely many players each having finitely many moves available in a given state. As a function of the current state and move vector, each player incurs a nonnegative cost. Assumptions are given for the expected discounted cost game to have a Nash equilibrium randomized stationary strategy. These conditions hold for bounded costs, thereby generalizing Parthasarathy (1973) and Federgruen (1978). Assumptions are given for the long-run average expected cost game to have a Nash equilibrium randomized stationary strategy, under which each player has constant average cost. A flow control example illustrates the results. This paper complements the treatment of the zero-sum case in Sennott (1993a).  相似文献   

8.
We consider Markov control processes with Borel state space and Feller transition probabilities, satisfying some generalized geometric ergodicity conditions. We provide a new theorem on the existence of a solution to the average cost optimality equation.  相似文献   

9.
We consider a model of a multipath routing system where arriving customers are routed to a set of identical, parallel, single server queues according to balancing policies operating without state information. After completion of service, customers are required to leave the system in their order of arrival, thus incurring an additional resequencing delay. We are interested in minimizing the end-to-end delay (including time at the resequencing buffer) experienced by arriving customers. To that end we establish the optimality of the Round–Robin routing assignment in two asymptotic regimes, namely heavy and light traffic: In heavy traffic, the Round–Robin customer assignment is shown to achieve the smallest (in the increasing convex stochastic ordering) end-to-end delay amongst all routing policies operating without queue state information. In light traffic, and for the special case of Poisson arrivals, we show that Round–Robin is again an optimal (in the strong stochastic ordering) routing policy. We illustrate the stochastic comparison results by several simulation examples. The work of the first author was supported through an ARCHIMEDES grant by the Greek Ministry of Education. The work of the second author was prepared through collaborative participation in the Communications and Networks Consortium sponsored by the U.S. Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement DAAD19-01-2-0011. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Laboratory or the U.S. Government.  相似文献   

10.
In a decision process (gambling or dynamic programming problem) with finite state space and arbitrary decision sets (gambles or actions), there is always available a Markov strategy which uniformly (nearly) maximizes the average time spent at a goal. If the decision sets are closed, there is even a stationary strategy with the same property.Examples are given to show that approximations by discounted or finite horizon payoffs are not useful for the general average reward problem.  相似文献   

11.
This paper deals with the average expected reward criterion for continuous-time Markov decision processes in general state and action spaces. The transition rates of underlying continuous-time jump Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We give conditions on the system's primitive data and under which we prove the existence of the average reward optimality equation and an average optimal stationary policy. Also, under our conditions we ensure the existence of ?-average optimal stationary policies. Moreover, we study some properties of average optimal stationary policies. We not only establish another average optimality equation on an average optimal stationary policy, but also present an interesting “martingale characterization” of such a policy. The approach provided in this paper is based on the policy iteration algorithm. It should be noted that our way is rather different from both the usually “vanishing discounting factor approach” and the “optimality inequality approach” widely used in the previous literature.  相似文献   

12.
《Optimization》2012,61(7):1593-1623
This paper deals with the ratio and time expected average criteria for constrained semi-Markov decision processes (SMDPs). The state and action spaces are Polish spaces, the rewards and costs are unbounded from above and from below, and the mean holding times are allowed to be unbounded from above. First, under general conditions we prove the existence of constrained-optimal policies for the ratio expected average criterion by developing a technique of occupation measures including the mean holding times for SMDPs, which are the generalizations of those for the standard discrete-time and continuous-time MDPs. Then, we give suitable conditions under which we establish the equivalence of the two average criteria by the optional sampling theorem, and thus we show the existence of constrained-optimal policies for the time expected average criterion. Finally, we illustrate the application of our main results with a controlled linear system, for which an exact optimal policy is obtained.  相似文献   

13.
We examine the resource allocation problem of partitioning identical servers into two parallel pooling centers, and simultaneously assigning job types to pooling centers. Each job type has a distinct Poisson arrival rate and a distinct holding cost per unit time. Each pooling center becomes a queueing system with an exponential service time distribution. The goal is to minimize the total holding cost. The problem is shown to be polynomial if a job type can be divided between the pooling centers, and NP-hard if dividing job types is not possible. When there are two servers and jobs cannot be divided, we demonstrate that the two pooling center configuration is rarely optimal. A heuristic which checks the single pooling center has an upper bound on the relative error of 4/3. The heuristic is extended for the multiple server problem, where relative error is bounded above by the number of servers.   相似文献   

14.
In this paper, we consider scalar linear stochastic differential games with average cost criterions. We solve the dynamic programming equations for these games and give the synthesis of saddle-point and Nash equilibrium solutions.The authors wish to thank A. Ichikawa for providing the initial impetus and helpful advice.  相似文献   

15.
16.
In this study, we introduce a cooperative parallel tabu search algorithm (CPTS) for the quadratic assignment problem (QAP). The QAP is an NP-hard combinatorial optimization problem that is widely acknowledged to be computationally demanding. These characteristics make the QAP an ideal candidate for parallel solution techniques. CPTS is a cooperative parallel algorithm in which the processors exchange information throughout the run of the algorithm as opposed to independent concurrent search strategies that aggregate data only at the end of execution. CPTS accomplishes this cooperation by maintaining a global reference set which uses the information exchange to promote both intensification and strategic diversification in a parallel environment. This study demonstrates the benefits that may be obtained from parallel computing in terms of solution quality, computational time and algorithmic flexibility. A set of 41 test problems obtained from QAPLIB were used to analyze the quality of the CPTS algorithm. Additionally, we report results for 60 difficult new test instances. The CPTS algorithm is shown to provide good solution quality for all problems in acceptable computational times. Out of the 41 test instances obtained from QAPLIB, CPTS is shown to meet or exceed the average solution quality of many of the best sequential and parallel approaches from the literature on all but six problems, whereas no other leading method exhibits a performance that is superior to this.  相似文献   

17.
This article studies the circular chromatic number of a class of circular partitionable graphs. We prove that an infinite family of circular partitionable graphs G has . A consequence of this result is that we obtain an infinite family of graphs G with the rare property that the deletion of each vertex decreases its circular chromatic number by exactly 1. © 2006 Wiley Periodicals, Inc. J Graph Theory  相似文献   

18.
The bounds are obtained for the average crosscap number. Let G be a graph which is not a tree. It is shown that the average crosscap number of G is not less than 2β(G)-1/2β(G)-1β(G)β(G) and not larger than/β(G). Furthermore, we also describe the structure of the graphs which attain the bounds of the average crosscap number.  相似文献   

19.
The bounds are obtained for the average crosscap number. Let G be a graph which is not a tree. It is shown that the average crosscap number of G is not less thanβ(G)-1/2β(G)-1β(G) and not larger thanβ(G). Furthermore, we also describe the structure of the graphs which attain the bounds of the average crosscap number.  相似文献   

20.
Consider n jobs and two machines. Each job has to be processed on both machines. The order in which it is dome is immaterial. However, the decision maker has to decide in advance which jobs will be processes first on machine 1 (2). We assume that processing times on each machine are identically exponentially distributed random variables. We prove that assigning equal number of jobs to be first processed by machine 1 (2) stochastically minimizes the makespan.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号