Similar Articles (20 results)
1.
This paper is concerned with classical concave-cost multi-echelon production/inventory control problems studied by W. Zangwill and others. It is well known that the problem with m production steps and n time periods can be solved by a dynamic programming algorithm in $O(n^4 m)$ steps, which is considered the fastest algorithm for this class of problems. In this paper, we show that an alternative 0–1 integer programming approach can solve the same problem much faster, particularly when n is large and the number of 0–1 integer variables is relatively small. This class of problems includes, among others, problems with set-up cost functions and piecewise linear cost functions with few linear pieces. The new approach can also solve problems with mixed concave/convex cost functions, which cannot be solved by dynamic programming algorithms.
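For intuition, here is a minimal sketch (not the paper's multi-echelon algorithm; all parameter values are hypothetical) of the dynamic programming recursion for the single-echelon special case with a fixed setup cost, a classic concave cost structure. With concave costs, an optimal plan produces each period's demand in exactly one period, so it suffices to search over the last production period:

```python
# Minimal sketch: DP for single-echelon lot sizing with a fixed setup
# cost K, linear unit cost c, and linear holding cost h -- a concave
# cost structure. Demands d[1..n] are illustrative. The zero-inventory-
# ordering property of concave-cost problems lets each period's demand
# be produced in exactly one earlier period.

def lot_sizing(d, K, c, h):
    n = len(d)
    INF = float("inf")
    f = [0.0] + [INF] * n          # f[t] = min cost to satisfy demands 1..t
    for t in range(1, n + 1):
        for j in range(1, t + 1):  # last production run covers periods j..t
            qty = sum(d[j - 1:t])
            hold = sum(h * (i - j) * d[i - 1] for i in range(j, t + 1))
            f[t] = min(f[t], f[j - 1] + K + c * qty + hold)
    return f[n]

print(lot_sizing(d=[20, 50, 10, 40], K=100.0, c=2.0, h=1.0))
```

This naive version runs in $O(n^3)$; with prefix sums the inner cost evaluation becomes constant time, giving $O(n^2)$ per echelon.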

2.
This note addresses the time aggregation approach to ergodic finite-state Markov decision processes with uncontrollable states. We propose using time aggregation as an intermediate step toward constructing a transformed MDP whose state space consists solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and results in a problem that can be solved by simpler, standard MDP algorithms.

3.
The Approximating Sequence Method for computing average cost optimal stationary policies in denumerable-state Markov decision chains, introduced in Sennott (1994), is reviewed. New methods for verifying the assumptions are given. These are useful for models with multidimensional state spaces that satisfy certain mild structural properties. The results are applied to four problems in the optimal routing of packets to parallel queues. Numerical results are given for one of the models.

4.
This paper provides a dynamic programming approach to the maximum principle for the optimal control of systems with distributed parameters. The process of the systems under consideration is governed by a partial differential equation. This paper is based on Chapter 2 of the author's PhD thesis under the supervision of Professor S. E. Dreyfus, to whom the author wishes to express his appreciation.

5.
For a given vector $x_0$, the sequence $\{x_t\}$ which optimizes the sum of discounted rewards $r(x_t, x_{t+1})$, where $r$ is a quadratic function, is shown to be generated by a linear decision rule $x_{t+1} = S x_t + R$. Moreover, the coefficients $R, S$ are given by explicit formulas in terms of the coefficients of the reward function $r$. A unique steady state is shown to exist (except for a degenerate case), and its stability is discussed.
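As a concrete illustration, the following sketch recovers $S$ and $R$ by value iteration on a quadratic value function in the scalar case. The coefficient names and the example reward are assumptions for illustration; the paper itself gives explicit closed-form formulas:

```python
# Sketch for the scalar case: reward r(x, y) = r0 + r1*x + r2*y
#   + r3*x**2 + r4*x*y + r5*y**2 with r5 < 0 (concave in the decision y)
# and discount beta < 1. Value iteration on quadratic v(x) = a*x^2+b*x+c;
# the first-order condition in y yields the linear rule y = S*x + R.

def lq_rule(r0, r1, r2, r3, r4, r5, beta, iters=500):
    a = b = c = 0.0                        # v(x) = a*x**2 + b*x + c
    for _ in range(iters):
        q = r5 + beta * a                  # curvature in y; must stay < 0
        S = -r4 / (2.0 * q)                # slope of the decision rule
        R = -(r2 + beta * b) / (2.0 * q)   # intercept of the decision rule
        # substitute y = S*x + R back into r(x, y) + beta*v(y)
        a_new = r3 + r4 * S + q * S * S
        b_new = r1 + (r2 + beta * b) * S + r4 * R + 2.0 * q * S * R
        c_new = r0 + beta * c + (r2 + beta * b) * R + q * R * R
        a, b, c = a_new, b_new, c_new
    return S, R

# Example: r(x, y) = -(y - 0.5*x)**2, i.e. r3 = -0.25, r4 = 1, r5 = -1;
# the optimal rule is y = 0.5*x, so this should print (0.5, 0.0).
print(lq_rule(0.0, 0.0, 0.0, -0.25, 1.0, -1.0, beta=0.95))
```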

6.
We present intensional dynamic programming (IDP), a generic framework for structured dynamic programming over atomic, propositional and relational representations of states and actions. We first develop set-based dynamic programming and show its equivalence with classical dynamic programming. We then show how to describe state sets intensionally using any form of structured knowledge representation and obtain a generic algorithm that can optimally solve large, even infinite, MDPs without explicit state space enumeration. We derive two new Bellman backup operators and algorithms. In order to support the view of IDP as a Rosetta stone for structured dynamic programming, we review many existing techniques that employ either propositional or relational knowledge representation frameworks.

7.
This paper uses dynamic programming to investigate when contestants should use lifelines and when they should stop answering in the TV quiz show 'Who Wants to Be a Millionaire?'. It obtains the optimal strategies for maximizing the expected reward and for maximizing the probability of winning a given amount of money.
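A toy version of the stopping part of this problem (lifelines omitted, with a made-up prize ladder, success probabilities, and guarantee levels, not the show's actual values) can be solved by a few lines of backward induction:

```python
# Toy backward induction for the stopping decision (lifelines omitted).
# All numbers below are illustrative stand-ins.

prize = [0, 1_000, 32_000, 64_000, 125_000, 250_000, 500_000, 1_000_000]
p = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]          # P(correct) per question
guarantee = [0, 1_000, 32_000, 32_000, 32_000, 32_000, 32_000]  # if wrong

n = len(p)
V = [0.0] * (n + 1)
V[n] = prize[n]                    # answered everything: take top prize
decision = [""] * n
for k in range(n - 1, -1, -1):     # backward over questions
    answer = p[k] * V[k + 1] + (1 - p[k]) * guarantee[k]
    V[k] = max(prize[k], answer)   # stop with prize[k] or keep playing
    decision[k] = "stop" if prize[k] >= answer else "answer"

print(f"expected winnings: {V[0]:.0f}", decision)
```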

8.
This paper is concerned with the convergence of a sequence of discrete-time Markov decision processes (DTMDPs) with constraints, state-action dependent discount factors, and possibly unbounded costs. Using the convex analytic approach under mild conditions, we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the "limit" one. Furthermore, we show that any countable-state DTMDP can be approximated by a sequence of finite-state DTMDPs, which are constructed using the truncation technique. Finally, we illustrate the approximation by solving a controlled queueing system numerically and give the corresponding error bound of the approximation.
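The truncation idea can be illustrated on a simple discounted service-rate-control queue. The parameters below are hypothetical and the setting is simpler than the paper's (no constraints, a constant discount factor), but the value at a fixed state visibly stabilizes as the truncation level N grows:

```python
# Approximate a countable-state queue by finite-state MDPs cut off at
# level N and watch the value at state 0 stabilize as N grows.

def truncated_value(N, lam=0.4, mus=(0.2, 0.6), costs=(1.0, 3.0),
                    hold=1.0, beta=0.9, iters=2000):
    # states 0..N; action a picks service rate mus[a] at cost costs[a]
    V = [0.0] * (N + 1)
    for _ in range(iters):
        W = [0.0] * (N + 1)
        for x in range(N + 1):
            best = float("inf")
            for a in (0, 1):
                up = min(x + 1, N)            # arrivals lost at the cut
                down = max(x - 1, 0)
                stay = 1.0 - lam - mus[a]     # uniformized self-loop
                ev = lam * V[up] + mus[a] * V[down] + stay * V[x]
                best = min(best, hold * x + costs[a] + beta * ev)
            W[x] = best
        V = W
    return V[0]

for N in (10, 20, 40, 80):
    print(N, round(truncated_value(N), 4))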

9.
The problem of characterizing the minimum perturbations to parameters in future stages of a discrete dynamic program necessary to change the optimal first policy is considered. Lower bounds on these perturbations are derived and used to establish ranges for the reward functions over which the optimal first policy is robust. A numerical example is presented to illustrate factors affecting the tightness of these bounds.

10.
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks to minimize the span semi-norm of the Bellman residual in a convex value function approximation space. The novelty here is that the optimality of a point in the approximation architecture is characterized by means of convex optimization concepts, and necessary and sufficient conditions for local optimality are derived. The procedure employs the classical AVI direction (the Bellman residual) combined with a set of independent search directions to improve the convergence rate. It has guaranteed convergence and satisfies, at least, the necessary optimality conditions over a prescribed set of directions. To illustrate the method, examples are presented that deal with a class of problems from the literature and a large state space queueing problem.
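For reference, the span semi-norm of a function v is sp(v) = max v − min v. A minimal sketch of evaluating the span of the Bellman residual for a tabular discounted MDP follows; the random data is an illustrative stand-in, not the paper's approximation architecture:

```python
import numpy as np

# Span semi-norm of the Bellman residual for a tabular discounted MDP.
# P[a] is the transition matrix under action a, c[a] the cost vector.

def bellman_residual_span(v, P, c, beta):
    # Tv(x) = min_a [ c[a][x] + beta * sum_y P[a][x, y] * v[y] ]
    Tv = np.min([c[a] + beta * P[a] @ v for a in range(len(P))], axis=0)
    residual = Tv - v
    return residual.max() - residual.min()   # sp(Tv - v)

rng = np.random.default_rng(0)
P = [rng.dirichlet(np.ones(5), size=5) for _ in range(2)]  # 2 actions
c = [rng.random(5) for _ in range(2)]
v = np.zeros(5)
print(bellman_residual_span(v, P, c, beta=0.95))
```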

11.
This paper addresses Markov Decision Processes over compact state and action spaces. We investigate the special case of linear dynamics and piecewise-linear and convex immediate costs for the average cost criterion. This model is very general and covers many interesting examples, for instance in inventory management. Due to the curse of dimensionality, the problem is intractable and optimal policies usually cannot be computed, not even for instances of moderate size.

12.
Fractional programming approach to fuzzy weighted average
This paper proposes a fractional programming approach to construct the membership function of the fuzzy weighted average. Based on the α-cut representation of fuzzy sets and the extension principle, a pair of fractional programs is formulated to find the α-cut of the fuzzy weighted average. Owing to the special structure of the fractional programs, in most cases the optimal solution can be found analytically. Consequently, the exact form of the membership function can be derived by taking the inverse function of the α-cut. For other cases, a discrete but exact solution to the fuzzy weighted average is provided via an efficient solution method. Examples are given for illustration.
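The same α-cut can be checked numerically by brute force (assuming nonnegative weights and small n). The sketch below enumerates vertices of the weight box, which suffices because a linear-fractional objective is monotone in each weight; the paper's contribution is solving the fractional programs analytically rather than by enumeration:

```python
from itertools import product

# Brute-force alpha-cut of the fuzzy weighted average
#   y = sum(w_i * x_i) / sum(w_i)
# given interval alpha-cuts for the ratings x_i and weights w_i.
# y is increasing in each x_i, and the extremum over each w_i lies at
# an interval endpoint, so checking the box vertices is exact.

def fwa_cut(x_lo, x_hi, w_lo, w_hi):
    lo, hi = float("inf"), float("-inf")
    for w in product(*zip(w_lo, w_hi)):   # every vertex of the weight box
        s = sum(w)
        lo = min(lo, sum(wi * xi for wi, xi in zip(w, x_lo)) / s)
        hi = max(hi, sum(wi * xi for wi, xi in zip(w, x_hi)) / s)
    return lo, hi

# alpha-cuts of three triangular fuzzy numbers at some alpha (illustrative)
print(fwa_cut(x_lo=[2.5, 4.5, 6.5], x_hi=[3.5, 5.5, 7.5],
              w_lo=[0.25, 0.45, 0.15], w_hi=[0.35, 0.55, 0.25]))
```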

13.
We consider Markov control processes with Borel state space and Feller transition probabilities, satisfying some generalized geometric ergodicity conditions. We provide a new theorem on the existence of a solution to the average cost optimality equation.

14.
In this paper, we consider a periodic-review stochastic inventory model with an asymmetric or piecewise-quadratic holding cost function and nonnegative production levels. It is assumed that the cost of deviating from an ideal production level or existing capacity is symmetric quadratic. It is shown that the optimal order policy is similar to the (s, S) policies found in the literature, except that the order-up-to quantity is a nonlinear function of the entering inventory level. Dynamic programming is used to derive the optimal policy. We provide numerical examples and a sensitivity analysis on the problem parameters. This research was supported by the Natural Sciences and Engineering Research Council of Canada under Grant No. A5872. The authors wish to thank an anonymous referee for very helpful comments on an earlier version of this paper.

15.
We apply stochastic dynamic programming to obtain a lower bound for the mean project completion time in a PERT network where the activity durations are exponentially distributed random variables. Moreover, these random variables are non-static in that the distributions themselves vary according to some randomness in society, such as strikes or inflation. This social randomness is modelled as a function of a separate continuous-time Markov process over the time horizon. The results are verified by simulation.
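A minimal simulation of the static special case (fixed exponential rates on a tiny two-path network, with the Markov-modulated environment omitted) shows the kind of verification the authors perform; the network and rates are hypothetical:

```python
import random

# Monte Carlo estimate of the mean completion time of a small PERT
# network with exponentially distributed activity durations.
# Precedences: A before C, B before D; the project ends when both
# paths A->C and B->D are finished.

rate = {"A": 1.0, "B": 0.5, "C": 0.8, "D": 1.2}   # activity -> exp. rate

def completion_time():
    d = {k: random.expovariate(r) for k, r in rate.items()}
    return max(d["A"] + d["C"], d["B"] + d["D"])

n = 100_000
est = sum(completion_time() for _ in range(n)) / n
print(f"estimated mean completion time: {est:.3f}")
```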

16.
When applying dynamic programming for optimal decision making, one usually needs considerable knowledge about the future. This knowledge, e.g. about future functions and parameters, necessary to determine optimal control policies, is often not available and thus precludes the application of dynamic programming. In the present paper it is shown that for a certain class of dynamic programming problems the optimal control policy is independent of the future. To illustrate the results, an application in inventory control is given, and further applications in the theories of economic growth and corporate finance are listed in the references.

17.
In this work the problem of obtaining an optimal maintenance policy for a single-machine, single-product workstation that deteriorates over time is addressed using Markov Decision Process (MDP) models. Two models are proposed. The decision criterion for the first model is based on the cost of performing maintenance, the cost of repairing a failed machine, and the cost of holding inventory while the machine is not available for production. For the second model, the cost of holding inventory is replaced by the cost of not satisfying the demand. The processing times of jobs, the inter-arrival times of jobs or units of demand, and the failure times are assumed to be random. The results show that in order to make better maintenance decisions, the interaction between the inventory (whether in process or final) and the number of shifts that the machine has been working without restoration has to be taken into account. If this interaction is considered, the long-run operational costs are reduced significantly. Moreover, structural properties of the optimal policies of the models are obtained after imposing conditions on the parameters of the models and on the distribution of the lifetime of a recently restored machine.

18.
19.
The aim of this paper is to extend the dynamic programming (DP) approach to multi-model optimal control problems (OCPs). We deal with robust optimization of multi-model control systems and are particularly interested in the Hamilton-Jacobi-Bellman (HJB) equation for this class of problems. We study a variant of the HJB equation for multi-model OCPs and examine the natural relationship between Bellman's DP techniques and the Robust Maximum Principle (MP). Moreover, we describe how to carry out the practical calculations in the context of multi-model LQ problems and derive the associated Riccati-type equation.
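For orientation, the classical single-model HJB equation that the multi-model variant generalizes reads, in standard notation (ours, not necessarily the paper's), with running cost $L$, dynamics $\dot{x} = f(t,x,u)$, and terminal cost $\phi$:

$$ -\frac{\partial V}{\partial t}(t,x) = \min_{u \in U}\Big[\, L(t,x,u) + \big\langle \nabla_x V(t,x),\, f(t,x,u) \big\rangle \,\Big], \qquad V(T,x) = \phi(x). $$

Roughly speaking, the multi-model variant replaces the single Hamiltonian by the worst case over the finite set of models, which is what ties it to the Robust Maximum Principle.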

20.
This paper establishes a rather complete optimality theory for the average cost semi-Markov decision model with a denumerable state space, compact metric action sets, and unbounded one-step costs, for the case where the underlying Markov chains have a single ergodic set. Under a condition which, roughly speaking, requires the existence of a finite set such that the supremum over all stationary policies of the expected time and the total expected absolute cost incurred until the first return to this set is finite for any starting state, we verify the existence of a finite solution to the average cost optimality equation and the existence of an average cost optimal stationary policy.
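In standard notation (ours, not necessarily the paper's), the average cost optimality equation for a semi-Markov decision model reads:

$$ h(i) = \min_{a \in A(i)} \Big[\, c(i,a) - g\,\tau(i,a) + \sum_{j} p_{ij}(a)\, h(j) \,\Big], \qquad i \in S, $$

where $g$ is the minimal average cost, $\tau(i,a)$ the expected sojourn time in state $i$ under action $a$, $c(i,a)$ the expected one-step cost, and $h$ the relative value function; a stationary policy attaining the minimum in every state is then average cost optimal.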
