Similar Articles
20 similar articles found (search time: 46 ms)
2.
We consider continuous-time Markov decision processes in Polish spaces. The performance of a control policy is measured by the expected discounted reward criterion associated with state-dependent discount factors. All underlying Markov processes are determined by the given transition rates, which are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. Using the dynamic programming approach, we establish the discounted reward optimality equation (DROE) and the existence and uniqueness of its solutions. Under suitable conditions, we also obtain a discounted optimal stationary policy that is optimal in the class of all randomized stationary policies. Moreover, when the transition rates are uniformly bounded, we provide an algorithm to compute (or at least approximate) the discounted reward optimal value function as well as a discounted optimal stationary policy. Finally, we use an example to illustrate our results. In particular, we derive an explicit and exact solution to the DROE and an explicit expression of a discounted optimal stationary policy for this example.
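For the uniformly bounded-rate case, the computation reduces (after uniformization) to value iteration with a state-dependent discount vector. Below is a minimal runnable sketch on a finite toy model; the transition matrices, rewards, and discount factors are illustrative assumptions, not the paper's:

```python
import numpy as np

def value_iteration(P, r, alpha, tol=1e-10, max_iter=10_000):
    """Solve V(s) = max_a [ r(s,a) + alpha(s) * sum_s' P[a][s,s'] V(s') ].

    P     : dict action -> (n x n) transition matrix (rows sum to 1)
    r     : dict action -> length-n reward vector
    alpha : length-n vector of state-dependent discount factors in (0,1)
    """
    n = len(alpha)
    V = np.zeros(n)
    for _ in range(max_iter):
        Q = np.stack([r[a] + alpha * (P[a] @ V) for a in P])  # one row per action
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    policy = np.array(list(P))[Q.argmax(axis=0)]  # greedy stationary policy
    return V, policy

# Toy 2-state example with two actions (numbers are made up for illustration).
P = {0: np.array([[0.9, 0.1], [0.2, 0.8]]),
     1: np.array([[0.5, 0.5], [0.5, 0.5]])}
r = {0: np.array([1.0, 0.0]), 1: np.array([0.8, 0.5])}
alpha = np.array([0.9, 0.8])          # discount factor depends on the state
V, policy = value_iteration(P, r, alpha)
```

Since each alpha(s) < 1, the Bellman operator is a contraction and the iteration converges to the unique fixed point of the (discrete-time analogue of the) DROE.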

4.
We study the long-run average performance of a fluid production/inventory model which alternates between ON periods and OFF periods. During ON periods of random lengths, items are added continuously, at some state-dependent rate, to the inventory. During OFF periods the content decreases (again at some state-dependent rate) back to some basic level. We derive the pertinent reward functionals in closed form. This analysis requires the steady-state distributions of the stock-level process and its jump counterpart. In several examples we use the explicit formulas obtained to maximize the long-run average net revenue numerically.
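When closed forms are unavailable, the long-run average revenue of such an ON/OFF fluid model can be estimated by crude time-discretized simulation. The sketch below uses purely illustrative state-dependent rates and reward structure; none of the model details are taken from the paper:

```python
import random

def long_run_average_revenue(n_steps=200_000, dt=0.01, base=1.0, seed=42):
    """Euler-discretized ON/OFF fluid inventory (illustrative assumptions):

    ON periods ~ Exp(1): content rises at state-dependent rate 2/(1+x).
    OFF periods: content drains at rate 1 until the base level, then ON resumes.
    Revenue accrues at rate x (holding value) minus cost 0.5 per unit produced.
    """
    rng = random.Random(seed)
    x, on, reward = base, True, 0.0
    t_switch = rng.expovariate(1.0)     # remaining length of the current ON period
    t = 0.0
    while t < n_steps * dt:
        if on:
            inflow = 2.0 / (1.0 + x)    # state-dependent production rate
            x += inflow * dt
            reward += (x - 0.5 * inflow) * dt
            t_switch -= dt
            if t_switch <= 0:
                on = False
        else:
            x = max(base, x - 1.0 * dt)  # state-independent drain, floored at base
            reward += x * dt
            if x <= base:                # back at the basic level: start a new ON period
                on = True
                t_switch = rng.expovariate(1.0)
        t += dt
    return reward / t

avg = long_run_average_revenue()
```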

6.
We study infinite-horizon asymptotic average optimality for parallel server networks with multiple classes of jobs and multiple server pools in the Halfin–Whitt regime. Three control formulations are considered: (1) minimizing the queueing and idleness cost, (2) minimizing the queueing cost under constraints on idleness at each server pool, and (3) fairly allocating the idle servers among different server pools. For the third problem, we consider a class of bounded-queue, bounded-state (BQBS) stable networks, in which any moment of the state is bounded by that of the queue only (for both the limiting diffusion and diffusion-scaled state processes). We show that the optimal values for the diffusion-scaled state processes converge to the corresponding values of the ergodic control problems for the limiting diffusion. We present a family of state-dependent Markov balanced saturation policies (BSPs) that stabilize the controlled diffusion-scaled state processes. It is shown that under these policies, the diffusion-scaled state process is exponentially ergodic, provided that at least one class of jobs has a positive abandonment rate. We also establish useful moment bounds, and study the ergodic properties of the diffusion-scaled state processes, which play a crucial role in proving the asymptotic optimality.

7.
This paper introduces a unified approach to diffusion approximations of signaling networks. This is accomplished by characterizing a broad class of networks that can be described by a set of quantities exchanged stochastically over time. We call this class stochastic Petri nets with probabilistic transitions, since it is described as a stochastic Petri net but allows a finite set of random outcomes for each transition. This extension permits effects on the network that are commonly interpreted as “routing” in queueing systems. The class is general enough to include, for instance, G-networks with negative customers and triggers as a particular case. With this class at hand, we derive a heavy-traffic approximation in which the processes that drive the transitions are state-dependent Poisson-type processes and the probabilities of the random outcomes are also state-dependent. The objective of this approach is to obtain a diffusion approximation that can be readily applied to several practical problems. We illustrate the use of the results with some numerical experiments.
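The pre-limit object is straightforward to simulate: each enabled transition fires at a state-dependent rate and then selects one of finitely many outcomes at random (the probabilistic "routing"). A hedged Gillespie-style sketch follows; the example net, rates, and routing probabilities are illustrative assumptions:

```python
import random

def simulate_spn(places, transitions, t_end=50.0, seed=11):
    """Gillespie simulation of a stochastic Petri net with probabilistic
    transitions.  `transitions` maps a name to (rate_fn, outcomes), where
    outcomes is a list of (effect_dict, probability) pairs."""
    rng = random.Random(seed)
    state = dict(places)
    t = 0.0
    while t < t_end:
        enabled = [(name, spec[0](state)) for name, spec in transitions.items()
                   if spec[0](state) > 0]
        total = sum(rate for _, rate in enabled)
        if total == 0:
            break                        # no transition enabled: net is dead
        t += rng.expovariate(total)      # exponential holding time
        # Choose the firing transition proportionally to its rate.
        name = rng.choices([n for n, _ in enabled],
                           weights=[rate for _, rate in enabled])[0]
        outcomes = transitions[name][1]
        # Probabilistic routing: pick one of the finite random outcomes.
        effect = rng.choices([e for e, _ in outcomes],
                             weights=[p for _, p in outcomes])[0]
        for place, delta in effect.items():
            state[place] += delta
    return state

# Toy net: tokens arrive at place A at rate 1; a token in A is served at
# rate 2 and then routed to B (prob 0.7) or leaves the net (prob 0.3).
transitions = {
    "arrive": (lambda s: 1.0, [({"A": +1}, 1.0)]),
    "serve":  (lambda s: 2.0 if s["A"] > 0 else 0.0,
               [({"A": -1, "B": +1}, 0.7), ({"A": -1}, 0.3)]),
}
state = simulate_spn({"A": 0, "B": 0}, transitions)
```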

9.
We develop a generic game platform that can be used to model various real-world systems with multiple intelligent cloud-computing pools and parallel queues for resource-competing users. Inside the platform, the software structure is modelled as a blockchain. All users are associated with Big Data arrival streams whose random dynamics are modelled by triply stochastic renewal reward processes (TSRRPs). Each user may be served simultaneously by multiple pools, while each pool with parallel servers may also serve multiple users at the same time via smart policies in the blockchain, e.g. a policy that myopically computes, at each fixed time, a Nash equilibrium point of a game-theoretic scheduling problem. To illustrate the effectiveness of our game platform, we model the performance measures of its internal data-flow dynamics (queue-length and workload processes) as reflecting diffusions with regime switching (RDRSs) under our scheduling policies. Using RDRS models, we prove that our myopic game-theoretic policy is an asymptotic Pareto minimal-dual-cost Nash equilibrium policy, globally over the whole time horizon, for a randomly evolving dynamic game problem. Iterative schemes for simulating our multi-dimensional RDRS models are also developed, with supporting numerical comparisons.

10.
We introduce semi-Markov fields and provide formulations for the basic terms of semi-Markov theory. In particular, we define and consider a class of associated reward fields. We then present a formula for the expected reward at any multidimensional time epoch. The formula is new even for classical semi-Markov processes. It gives the expected cumulative reward for fairly large classes of reward functions; in particular, it recovers the formulas for the expected cumulative reward given in Masuda and Sumitau (1991), Soltani (1996) and Soltani and Khorshidian (1998).

11.
In this paper, we study constrained continuous-time Markov decision processes with a denumerable state space and unbounded reward/cost and transition rates. The criterion to be maximized is the expected average reward, and a constraint is imposed on an expected average cost. We give suitable conditions that ensure the existence of a constrained-optimal policy. Moreover, we show that the constrained-optimal policy randomizes between two stationary policies differing in at most one state. Finally, we use a controlled queueing system to illustrate our conditions. Supported by NSFC, NCET and RFDP.

14.
We study distributed algorithms for solving global optimization problems in which the objective function is the sum of local objective functions of agents and the constraint set is given by the intersection of local constraint sets of agents. We assume that each agent knows only its own local objective function and constraint set, and exchanges information with the other agents over a randomly varying network topology to update its information state. We assume a state-dependent communication model over this topology: communication is Markovian with respect to the states of the agents, and the probability with which the links are available depends on the states of the agents. We study a projected multi-agent subgradient algorithm under state-dependent communication. The state-dependence of the communication introduces significant challenges and couples the study of information exchange with the analysis of subgradient steps and projection errors. We first show that the multi-agent subgradient algorithm, when used with a constant stepsize, may cause the agent estimates to diverge with probability one. Under some assumptions on the stepsize sequence, we provide convergence-rate bounds on a “disagreement metric” between the agent estimates. Our bounds are time-nonhomogeneous in the sense that they depend on the initial starting time. Despite this, we show that the agent estimates reach consensus and converge to the same optimal solution of the global optimization problem with probability one, under different assumptions on the local constraint sets and the stepsize sequence.
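The flavour of the algorithm can be sketched in a toy setting: quadratic local objectives over a common interval, random pairwise communication standing in for the paper's state-dependent Markovian links, a diminishing stepsize, and a projection step. All specifics below are illustrative assumptions:

```python
import numpy as np

def projected_multi_agent_subgradient(targets, lo, hi, iters=3000, seed=0):
    """Each agent i minimizes f_i(x) = (x - targets[i])^2 over [lo, hi];
    the global problem minimizes sum_i f_i over the common interval.
    Links are drawn uniformly at random each round (a simple stand-in for
    the paper's state-dependent Markovian communication model)."""
    rng = np.random.default_rng(seed)
    n = len(targets)
    x = rng.uniform(lo, hi, size=n)          # agent estimates
    for k in range(1, iters + 1):
        # Gossip step: one random pair of agents averages its estimates.
        i, j = rng.choice(n, size=2, replace=False)
        x[i] = x[j] = 0.5 * (x[i] + x[j])
        # Subgradient step with diminishing stepsize, then projection.
        step = 1.0 / k
        x = x - step * 2.0 * (x - np.asarray(targets))
        x = np.clip(x, lo, hi)               # projection onto the constraint set
    return x

targets = [0.0, 1.0, 2.0]                    # optimum of the sum is x* = 1.0
x = projected_multi_agent_subgradient(targets, lo=-5.0, hi=5.0)
```

With the diminishing stepsize the estimates reach approximate consensus near the minimizer of the sum; with a constant stepsize (as the abstract notes) convergence can fail.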

15.
Motivated by problems in behavioural finance, we provide two explicit constructions of a randomized stopping time which embeds a given centred distribution μ on the integers into a simple symmetric random walk in a uniformly integrable manner. Our first construction has a simple Markovian structure: at each step, we stop if an independent coin with a state-dependent bias returns tails. Our second construction is a discrete analogue of the celebrated Azéma–Yor solution and requires independent coin tosses only when excursions away from the maximum breach predefined levels. Further, this construction maximizes the distribution of the stopped running maximum among all uniformly integrable embeddings of μ.
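The first construction's mechanism is easy to simulate: run the walk and stop when a coin with a state-dependent tails-probability comes up tails. Choosing the bias function to embed a given μ is the paper's contribution; the sketch below only illustrates the mechanism, with a degenerate (assumed) bias that recovers a first-exit time:

```python
import random

def randomized_embedding(p_stop, n_paths=20_000, seed=1, max_steps=10_000):
    """Run a simple symmetric random walk from 0; after each step, stop
    if an independent coin with state-dependent tails-probability
    p_stop(x) comes up tails.  Returns the empirical stopped law."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_paths):
        x = 0
        for _ in range(max_steps):
            x += rng.choice((-1, 1))
            if rng.random() < p_stop(x):   # state-dependent coin
                break
        counts[x] = counts.get(x, 0) + 1
    return {k: v / n_paths for k, v in counts.items()}

# Degenerate bias: stop exactly on hitting +/-2, embedding uniform{-2, 2}.
law = randomized_embedding(lambda x: 1.0 if abs(x) == 2 else 0.0)
```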

16.
We consider state-dependent stochastic networks in the heavy-traffic diffusion limit, represented by reflected jump-diffusions in the orthant ℝ+^n with state-dependent reflection directions upon hitting boundary faces. Jumps are allowed in each coordinate by means of independent Poisson random measures with jump amplitudes depending on the state of the process immediately before each jump. For this class of reflected jump-diffusion processes, sufficient conditions for the existence of a product-form stationary density and an ergodic characterization of the stationary distribution are provided. Moreover, this stationary density is characterized in terms of semimartingale local times at the boundaries, and it is shown to be continuous and bounded. A central role is played by a previously established semimartingale local-time representation of the regulator processes. F.J. Piera's research supported in part by CONICYT, Chile, FONDECYT Project 1070797. R.R. Mazumdar's research supported in part by NSF, USA, Grant 0087404 through the Networking Research Program, and a Discovery Grant from NSERC, Canada.

17.
18.
In this paper, we consider a class of nonlinear autoregressive (AR) processes with state-dependent switching, which are two-component Markov processes. The state-dependent switching model is a nontrivial generalization of the Markovian switching formulation and includes Markovian switching as a special case. We prove Feller and strong Feller continuity by introducing auxiliary processes and making use of Radon-Nikodym derivatives. We then investigate geometric ergodicity via the Foster-Lyapunov inequality. Moreover, we establish V-uniform ergodicity by introducing additional auxiliary processes and constructing certain order-preserving couplings of the original as well as the auxiliary processes. In addition, illustrative examples are provided for demonstration.
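A minimal simulation sketch of such a two-component process, with illustrative (assumed) AR coefficients and a switching probability that depends on the current state:

```python
import random

def simulate_switching_ar(n=5000, seed=7):
    """Two-component process (X_k, R_k): regime R in {0, 1} switches with a
    state-dependent probability, and X follows a regime-dependent AR(1)
    recursion.  Coefficients are illustrative; stability of this sketch
    relies on |a_r| < 1 in both regimes."""
    rng = random.Random(seed)
    a = (0.5, -0.3)              # AR coefficient in each regime
    x, r = 0.0, 0
    path = []
    for _ in range(n):
        # Switching probability depends on the current state x.
        p_switch = 0.1 + 0.3 / (1.0 + x * x)
        if rng.random() < p_switch:
            r = 1 - r
        x = a[r] * x + rng.gauss(0.0, 1.0)   # regime-dependent AR(1) step
        path.append(x)
    return path

path = simulate_switching_ar()
```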

19.
Let µ1,...,µk be d-dimensional probability measures on ℝ^d with mean 0. At each time we choose one of the measures based on the history of the process and take a step according to that measure. We give conditions for transience of such processes and also construct examples of recurrent processes of this type. In particular, in dimension 3 we give the complete picture: every walk generated by two measures is transient, and there exists a recurrent walk generated by three measures.
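The mechanism can be sketched as a walk that, at each step, picks one of several mean-zero step distributions according to an arbitrary history-dependent rule. The measures and the rule below are illustrative assumptions (lattice-supported for simplicity):

```python
import random

def adaptive_walk(measures, chooser, n_steps=1000, seed=3):
    """Walk in Z^3: at each step, chooser(history) picks the index of a
    mean-zero step distribution, and a step is drawn from it."""
    rng = random.Random(seed)
    pos = (0, 0, 0)
    history = [pos]
    for _ in range(n_steps):
        support, probs = measures[chooser(history)]
        step = rng.choices(support, weights=probs)[0]
        pos = tuple(p + s for p, s in zip(pos, step))
        history.append(pos)
    return history

# Two mean-zero measures on Z^3: axis steps along x, or along y/z.
mu0 = ([(1, 0, 0), (-1, 0, 0)], [0.5, 0.5])
mu1 = ([(0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)], [0.25] * 4)
# History-dependent rule: use mu1 whenever the walk sits on the x-axis.
hist = adaptive_walk([mu0, mu1],
                     lambda h: 1 if h[-1][1] == h[-1][2] == 0 else 0)
```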

20.
We derive simple criteria to ensure the finiteness of the mean first-passage times into semi-infinite sets for real-valued skip-free processes, which move by random jumps in one direction and deterministically, for a random time, in the other. These criteria are applied to state-dependent queueing systems, to give finiteness of the mean of the busy period, and to a self-correcting birth process.
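As a concrete instance, the mean busy period of a state-dependent birth-death queue can be estimated by simulating first passage to the empty state. The rates below are illustrative; in the constant-rate M/M/1 case the classical check E[B] = 1/(μ − λ) applies:

```python
import random

def mean_busy_period(lam, mu_rate, n_cycles=20_000, seed=5):
    """Estimate the mean busy period of a state-dependent M/M/1-type queue:
    arrival rate lam(x) and service rate mu_rate(x) for queue length x >= 1.
    A busy period starts at x = 1 and ends on first passage to x = 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_cycles):
        x, t = 1, 0.0
        while x > 0:
            a, s = lam(x), mu_rate(x)
            t += rng.expovariate(a + s)                  # exponential holding time
            x += 1 if rng.random() < a / (a + s) else -1  # birth vs. death
        total += t
    return total / n_cycles

# Constant rates lam = 1, mu = 2: theory gives E[busy period] = 1/(2 - 1) = 1.
est = mean_busy_period(lambda x: 1.0, lambda x: 2.0)
```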
