Similar Articles (20 results)
1.
This paper is concerned with classical concave-cost multi-echelon production/inventory control problems studied by W. Zangwill and others. It is well known that the problem with m production steps and n time periods can be solved by a dynamic programming algorithm in $O(n^4 m)$ steps, which is considered the fastest algorithm for this class of problems. In this paper, we show that an alternative 0–1 integer programming approach can solve the same problem much faster, particularly when n is large and the number of 0–1 integer variables is relatively small. This class of problems includes, among others, problems with set-up cost functions and piecewise linear cost functions with few linear pieces. The new approach can also solve problems with mixed concave/convex cost functions, which cannot be solved by dynamic programming algorithms.
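For intuition, here is a minimal sketch (not the paper's multi-echelon algorithm; all parameter values are hypothetical) of the dynamic programming recursion for the single-echelon special case with a fixed setup cost, a classic concave cost structure. With concave costs, an optimal plan produces each period's demand in exactly one period, so it suffices to search over the last production period:

```python
# Minimal sketch: DP for single-echelon lot sizing with a fixed setup
# cost K, linear unit cost c, and linear holding cost h -- a concave
# cost structure. Demands d[1..n] are illustrative. The zero-inventory-
# ordering property of concave-cost problems lets each period's demand
# be produced in exactly one earlier period.

def lot_sizing(d, K, c, h):
    n = len(d)
    INF = float("inf")
    f = [0.0] + [INF] * n          # f[t] = min cost to satisfy demands 1..t
    for t in range(1, n + 1):
        for j in range(1, t + 1):  # last production run covers periods j..t
            qty = sum(d[j - 1:t])
            hold = sum(h * (i - j) * d[i - 1] for i in range(j, t + 1))
            f[t] = min(f[t], f[j - 1] + K + c * qty + hold)
    return f[n]

print(lot_sizing(d=[20, 50, 10, 40], K=100.0, c=2.0, h=1.0))
```

This naive version runs in $O(n^3)$; with prefix sums the inner cost evaluation becomes constant time, giving $O(n^2)$ per echelon.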

2.
This note addresses the time aggregation approach to ergodic finite-state Markov decision processes with uncontrollable states. We propose using time aggregation as an intermediate step toward constructing a transformed MDP whose state space consists solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and results in a problem that can be solved by simpler, standard MDP algorithms.

3.
The Approximating Sequence Method for computing average cost optimal stationary policies in denumerable-state Markov decision chains, introduced in Sennott (1994), is reviewed. New methods for verifying the assumptions are given. These are useful for models with multidimensional state spaces that satisfy certain mild structural properties. The results are applied to four problems in the optimal routing of packets to parallel queues. Numerical results are given for one of the models.

4.
This paper provides a dynamic programming approach to the maximum principle for the optimal control of systems with distributed parameters. The process of the systems under consideration is governed by a partial differential equation. This paper is based on Chapter 2 of the author's PhD thesis under the supervision of Professor S. E. Dreyfus, to whom the author wishes to express his appreciation.

5.
For a given vector $x_0$, the sequence $\{x_t\}$ which optimizes the sum of discounted rewards $r(x_t, x_{t+1})$, where $r$ is a quadratic function, is shown to be generated by a linear decision rule $x_{t+1} = S x_t + R$. Moreover, the coefficients $R, S$ are given by explicit formulas in terms of the coefficients of the reward function $r$. A unique steady state is shown to exist (except for a degenerate case), and its stability is discussed.
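As a concrete illustration, the following sketch recovers $S$ and $R$ by value iteration on a quadratic value function in the scalar case. The coefficient names and the example reward are assumptions for illustration; the paper itself gives explicit closed-form formulas:

```python
# Sketch for the scalar case: reward r(x, y) = r0 + r1*x + r2*y
#   + r3*x**2 + r4*x*y + r5*y**2 with r5 < 0 (concave in the decision y)
# and discount beta < 1. Value iteration on quadratic v(x) = a*x^2+b*x+c;
# the first-order condition in y yields the linear rule y = S*x + R.

def lq_rule(r0, r1, r2, r3, r4, r5, beta, iters=500):
    a = b = c = 0.0                        # v(x) = a*x**2 + b*x + c
    for _ in range(iters):
        q = r5 + beta * a                  # curvature in y; must stay < 0
        S = -r4 / (2.0 * q)                # slope of the decision rule
        R = -(r2 + beta * b) / (2.0 * q)   # intercept of the decision rule
        # substitute y = S*x + R back into r(x, y) + beta*v(y)
        a_new = r3 + r4 * S + q * S * S
        b_new = r1 + (r2 + beta * b) * S + r4 * R + 2.0 * q * S * R
        c_new = r0 + beta * c + (r2 + beta * b) * R + q * R * R
        a, b, c = a_new, b_new, c_new
    return S, R

# Example: r(x, y) = -(y - 0.5*x)**2, i.e. r3 = -0.25, r4 = 1, r5 = -1;
# the optimal rule is y = 0.5*x, so this should print (0.5, 0.0).
print(lq_rule(0.0, 0.0, 0.0, -0.25, 1.0, -1.0, beta=0.95))
```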

6.
We present intensional dynamic programming (IDP), a generic framework for structured dynamic programming over atomic, propositional and relational representations of states and actions. We first develop set-based dynamic programming and show its equivalence with classical dynamic programming. We then show how to describe state sets intensionally using any form of structured knowledge representation and obtain a generic algorithm that can optimally solve large, even infinite, MDPs without explicit state space enumeration. We derive two new Bellman backup operators and algorithms. In order to support the view of IDP as a Rosetta stone for structured dynamic programming, we review many existing techniques that employ either propositional or relational knowledge representation frameworks.

7.
This paper uses dynamic programming to investigate when contestants should use lifelines and when they should stop answering in the TV quiz show 'Who Wants to Be a Millionaire?'. It obtains the optimal strategies for maximizing the expected reward and for maximizing the probability of winning a given amount of money.
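A toy version of the stopping part of this problem (lifelines omitted, with a made-up prize ladder, success probabilities, and guarantee levels, not the show's actual values) can be solved by a few lines of backward induction:

```python
# Toy backward induction for the stopping decision (lifelines omitted).
# All numbers below are illustrative stand-ins.

prize = [0, 1_000, 32_000, 64_000, 125_000, 250_000, 500_000, 1_000_000]
p = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]          # P(correct) per question
guarantee = [0, 1_000, 32_000, 32_000, 32_000, 32_000, 32_000]  # if wrong

n = len(p)
V = [0.0] * (n + 1)
V[n] = prize[n]                    # answered everything: take top prize
decision = [""] * n
for k in range(n - 1, -1, -1):     # backward over questions
    answer = p[k] * V[k + 1] + (1 - p[k]) * guarantee[k]
    V[k] = max(prize[k], answer)   # stop with prize[k] or keep playing
    decision[k] = "stop" if prize[k] >= answer else "answer"

print(f"expected winnings: {V[0]:.0f}", decision)
```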

8.
This paper is concerned with the convergence of a sequence of discrete-time Markov decision processes (DTMDPs) with constraints, state-action dependent discount factors, and possibly unbounded costs. Using the convex analytic approach under mild conditions, we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the "limit" one. Furthermore, we show that any countable-state DTMDP can be approximated by a sequence of finite-state DTMDPs, which are constructed using the truncation technique. Finally, we illustrate the approximation by solving a controlled queueing system numerically and give the corresponding error bound of the approximation.
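The truncation idea can be illustrated on a simple discounted service-rate-control queue. The parameters below are hypothetical and the setting is simpler than the paper's (no constraints, a constant discount factor), but the value at a fixed state visibly stabilizes as the truncation level N grows:

```python
# Approximate a countable-state queue by finite-state MDPs cut off at
# level N and watch the value at state 0 stabilize as N grows.

def truncated_value(N, lam=0.4, mus=(0.2, 0.6), costs=(1.0, 3.0),
                    hold=1.0, beta=0.9, iters=2000):
    # states 0..N; action a picks service rate mus[a] at cost costs[a]
    V = [0.0] * (N + 1)
    for _ in range(iters):
        W = [0.0] * (N + 1)
        for x in range(N + 1):
            best = float("inf")
            for a in (0, 1):
                up = min(x + 1, N)            # arrivals lost at the cut
                down = max(x - 1, 0)
                stay = 1.0 - lam - mus[a]     # uniformized self-loop
                ev = lam * V[up] + mus[a] * V[down] + stay * V[x]
                best = min(best, hold * x + costs[a] + beta * ev)
            W[x] = best
        V = W
    return V[0]

for N in (10, 20, 40, 80):
    print(N, round(truncated_value(N), 4))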

9.
The problem of characterizing the minimum perturbations to parameters in future stages of a discrete dynamic program necessary to change the optimal first policy is considered. Lower bounds on these perturbations are derived and used to establish ranges for the reward functions over which the optimal first policy is robust. A numerical example is presented to illustrate factors affecting the tightness of these bounds.

10.
This paper deals with approximate value iteration (AVI) algorithms applied to discounted dynamic programming (DP) problems. For a fixed control policy, the span semi-norm of the so-called Bellman residual is shown to be convex in the Banach space of candidate solutions to the DP problem. This fact motivates the introduction of an AVI algorithm with local search that seeks to minimize the span semi-norm of the Bellman residual in a convex value function approximation space. The novelty here is that the optimality of a point in the approximation architecture is characterized by means of convex optimization concepts, and necessary and sufficient conditions for local optimality are derived. The procedure employs the classical AVI direction (the Bellman residual) combined with a set of independent search directions to improve the convergence rate. It has guaranteed convergence and satisfies, at least, the necessary optimality conditions over a prescribed set of directions. To illustrate the method, examples are presented that deal with a class of problems from the literature and a large state space queueing problem.
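For reference, the span semi-norm of a function v is sp(v) = max v − min v. A minimal sketch of evaluating the span of the Bellman residual for a tabular discounted MDP follows; the random data is an illustrative stand-in, not the paper's approximation architecture:

```python
import numpy as np

# Span semi-norm of the Bellman residual for a tabular discounted MDP.
# P[a] is the transition matrix under action a, c[a] the cost vector.

def bellman_residual_span(v, P, c, beta):
    # Tv(x) = min_a [ c[a][x] + beta * sum_y P[a][x, y] * v[y] ]
    Tv = np.min([c[a] + beta * P[a] @ v for a in range(len(P))], axis=0)
    residual = Tv - v
    return residual.max() - residual.min()   # sp(Tv - v)

rng = np.random.default_rng(0)
P = [rng.dirichlet(np.ones(5), size=5) for _ in range(2)]  # 2 actions
c = [rng.random(5) for _ in range(2)]
v = np.zeros(5)
print(bellman_residual_span(v, P, c, beta=0.95))
```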

11.
This paper addresses Markov Decision Processes over compact state and action spaces. We investigate the special case of linear dynamics and piecewise-linear and convex immediate costs for the average cost criterion. This model is very general and covers many interesting examples, for instance in inventory management. Due to the curse of dimensionality, the problem is intractable and optimal policies usually cannot be computed, not even for instances of moderate size.

12.
Fractional programming approach to fuzzy weighted average
This paper proposes a fractional programming approach to construct the membership function of the fuzzy weighted average. Based on the α-cut representation of fuzzy sets and the extension principle, a pair of fractional programs is formulated to find the α-cut of the fuzzy weighted average. Owing to the special structure of the fractional programs, in most cases the optimal solution can be found analytically. Consequently, the exact form of the membership function can be derived by taking the inverse function of the α-cut. For other cases, a discrete but exact solution to the fuzzy weighted average is provided via an efficient solution method. Examples are given for illustration.
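The same α-cut can be checked numerically by brute force (assuming nonnegative weights and small n). The sketch below enumerates vertices of the weight box, which suffices because a linear-fractional objective is monotone in each weight; the paper's contribution is solving the fractional programs analytically rather than by enumeration:

```python
from itertools import product

# Brute-force alpha-cut of the fuzzy weighted average
#   y = sum(w_i * x_i) / sum(w_i)
# given interval alpha-cuts for the ratings x_i and weights w_i.
# y is increasing in each x_i, and the extremum over each w_i lies at
# an interval endpoint, so checking the box vertices is exact.

def fwa_cut(x_lo, x_hi, w_lo, w_hi):
    lo, hi = float("inf"), float("-inf")
    for w in product(*zip(w_lo, w_hi)):   # every vertex of the weight box
        s = sum(w)
        lo = min(lo, sum(wi * xi for wi, xi in zip(w, x_lo)) / s)
        hi = max(hi, sum(wi * xi for wi, xi in zip(w, x_hi)) / s)
    return lo, hi

# alpha-cuts of three triangular fuzzy numbers at some alpha (illustrative)
print(fwa_cut(x_lo=[2.5, 4.5, 6.5], x_hi=[3.5, 5.5, 7.5],
              w_lo=[0.25, 0.45, 0.15], w_hi=[0.35, 0.55, 0.25]))
```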

13.
We consider Markov control processes with Borel state space and Feller transition probabilities, satisfying some generalized geometric ergodicity conditions. We provide a new theorem on the existence of a solution to the average cost optimality equation.

14.
In this paper, we consider a periodic-review stochastic inventory model with an asymmetric or piecewise-quadratic holding cost function and nonnegative production levels. It is assumed that the cost of deviating from an ideal production level or existing capacity is symmetric quadratic. It is shown that the optimal order policy is similar to the (s, S) policies found in the literature, except that the order-up-to quantity is a nonlinear function of the entering inventory level. Dynamic programming is used to derive the optimal policy. We provide numerical examples and a sensitivity analysis on the problem parameters. This research was supported by the Natural Sciences and Engineering Research Council of Canada under Grant No. A5872. The authors wish to thank an anonymous referee for very helpful comments on an earlier version of this paper.

15.
We apply stochastic dynamic programming to obtain a lower bound for the mean project completion time in a PERT network where the activity durations are exponentially distributed random variables. Moreover, these random variables are non-static in that the distributions themselves vary according to some randomness in society, such as strikes or inflation. This social randomness is modelled as a function of a separate continuous-time Markov process over the time horizon. The results are verified by simulation.
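A minimal simulation of the static special case (fixed exponential rates on a tiny two-path network, with the Markov-modulated environment omitted) shows the kind of verification the authors perform; the network and rates are hypothetical:

```python
import random

# Monte Carlo estimate of the mean completion time of a small PERT
# network with exponentially distributed activity durations.
# Precedences: A before C, B before D; the project ends when both
# paths A->C and B->D are finished.

rate = {"A": 1.0, "B": 0.5, "C": 0.8, "D": 1.2}   # activity -> exp. rate

def completion_time():
    d = {k: random.expovariate(r) for k, r in rate.items()}
    return max(d["A"] + d["C"], d["B"] + d["D"])

n = 100_000
est = sum(completion_time() for _ in range(n)) / n
print(f"estimated mean completion time: {est:.3f}")
```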

16.
When applying dynamic programming for optimal decision making, one usually needs considerable knowledge about the future. This knowledge, e.g. about future functions and parameters, necessary to determine optimal control policies, is often not available and thus precludes the application of dynamic programming. In the present paper it is shown that for a certain class of dynamic programming problems the optimal control policy is independent of the future. To illustrate the results, an application in inventory control is given, and further applications in the theories of economic growth and corporate finance are listed in the references.

17.
In this work the problem of obtaining an optimal maintenance policy for a single-machine, single-product workstation that deteriorates over time is addressed using Markov Decision Process (MDP) models. Two models are proposed. The decision criterion for the first model is based on the cost of performing maintenance, the cost of repairing a failed machine, and the cost of holding inventory while the machine is not available for production. For the second model, the cost of holding inventory is replaced by the cost of not satisfying the demand. The processing times of jobs, the inter-arrival times of jobs or units of demand, and the failure times are assumed to be random. The results show that in order to make better maintenance decisions, the interaction between the inventory (whether in process or final) and the number of shifts that the machine has been working without restoration has to be taken into account. If this interaction is considered, the long-run operational costs are reduced significantly. Moreover, structural properties of the optimal policies of the models are obtained after imposing conditions on the parameters of the models and on the distribution of the lifetime of a recently restored machine.

18.
19.
The aim of this paper is to extend the dynamic programming (DP) approach to multi-model optimal control problems (OCPs). We deal with robust optimization of multi-model control systems and are particularly interested in the Hamilton-Jacobi-Bellman (HJB) equation for this class of problems. We study a variant of the HJB equation for multi-model OCPs and examine the natural relationship between Bellman's DP techniques and the Robust Maximum Principle (MP). Moreover, we describe how to carry out the practical calculations in the context of multi-model LQ problems and derive the associated Riccati-type equation.
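For orientation, the classical single-model HJB equation that the multi-model variant generalizes reads, in standard notation (ours, not necessarily the paper's), with running cost $L$, dynamics $\dot{x} = f(t,x,u)$, and terminal cost $\phi$:

$$ -\frac{\partial V}{\partial t}(t,x) = \min_{u \in U}\Big[\, L(t,x,u) + \big\langle \nabla_x V(t,x),\, f(t,x,u) \big\rangle \,\Big], \qquad V(T,x) = \phi(x). $$

Roughly speaking, the multi-model variant replaces the single Hamiltonian by the worst case over the finite set of models, which is what ties it to the Robust Maximum Principle.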

20.
This paper establishes a rather complete optimality theory for the average cost semi-Markov decision model with a denumerable state space, compact metric action sets, and unbounded one-step costs, for the case where the underlying Markov chains have a single ergodic set. Under a condition which, roughly speaking, requires the existence of a finite set such that the supremum over all stationary policies of the expected time and the total expected absolute cost incurred until the first return to this set is finite for any starting state, we verify the existence of a finite solution to the average cost optimality equation and the existence of an average cost optimal stationary policy.
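In standard notation (ours, not necessarily the paper's), the average cost optimality equation for a semi-Markov decision model reads:

$$ h(i) = \min_{a \in A(i)} \Big[\, c(i,a) - g\,\tau(i,a) + \sum_{j} p_{ij}(a)\, h(j) \,\Big], \qquad i \in S, $$

where $g$ is the minimal average cost, $\tau(i,a)$ the expected sojourn time in state $i$ under action $a$, $c(i,a)$ the expected one-step cost, and $h$ the relative value function; a stationary policy attaining the minimum in every state is then average cost optimal.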
