Similar Documents
20 similar documents found
1.
Summary: A general discrete decision process is formulated which includes both undiscounted and discounted semi-Markovian decision processes as special cases. A policy-iteration algorithm is presented and shown to converge to an optimal policy. Properties of the coupled functional equations are derived. Primal and dual linear programming formulations of the optimization problem are also given. An application is given to the Markov ratio decision process.
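As a concrete illustration of the policy-iteration step in the discounted special case, the sketch below runs Howard-style policy iteration on a small finite MDP. The transition probabilities, rewards, and discount factor are made-up illustrative data, not anything taken from the paper.

```python
# Minimal policy-iteration sketch for a finite discounted MDP
# (the discounted special case of the general process described above).
# All numbers below are illustrative assumptions, not data from the paper.
import numpy as np

n_states, n_actions, beta = 3, 2, 0.9          # beta: discount factor (assumed)
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
r = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # one-step rewards

def evaluate(policy):
    """Solve (I - beta * P_pi) v = r_pi for the value of a stationary policy."""
    P_pi = P[np.arange(n_states), policy]
    r_pi = r[np.arange(n_states), policy]
    return np.linalg.solve(np.eye(n_states) - beta * P_pi, r_pi)

policy = np.zeros(n_states, dtype=int)
while True:
    v = evaluate(policy)
    q = r + beta * P @ v                         # q[s, a]
    improved = q.argmax(axis=1)
    if np.array_equal(improved, policy):         # no further improvement: policy is optimal
        break
    policy = improved

print("optimal policy:", policy, "values:", np.round(v, 3))
```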

2.
This article includes an empirical study of the housing market using the statistical method of Markov processes. The first phase of the study is devoted to measuring the filtering process in a selected neighborhood by estimating probabilities of transition from one income group to another over the period 1949–1969, using four-year intervals. The estimated transition probabilities are then used to forecast the occupancy structure for different periods, and the suitability of applying the Markov process to long-term policy analysis in housing is examined. The final phase of the study examines the steady-state occupancy structure by various income categories of household. The study indicates a fruitful application of the Markov process in long-term housing policy analysis.
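The core computations described here, estimating transition probabilities between income groups from observed moves, forecasting the occupancy structure several intervals ahead, and finding the steady state, look roughly like the following sketch. The counts and occupancy shares are invented illustrative numbers, not the 1949–1969 neighborhood data.

```python
# Sketch of the Markov-chain computations described above: estimate a transition
# matrix from observed moves between income groups, forecast the occupancy
# structure, and find the steady state.  The counts below are illustrative
# assumptions, not the data used in the study.
import numpy as np

groups = ["low", "middle", "high"]
# moves[i, j]: dwellings observed passing from income group i to j in one interval
moves = np.array([[80, 15,  5],
                  [20, 60, 20],
                  [ 5, 25, 70]], dtype=float)
P = moves / moves.sum(axis=1, keepdims=True)          # row-stochastic transition matrix

occupancy = np.array([0.5, 0.3, 0.2])                 # current occupancy shares (assumed)
forecast = occupancy @ np.linalg.matrix_power(P, 5)   # structure five intervals ahead

# Steady state: left eigenvector of P for eigenvalue 1, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(P.T)
stat = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
stat = stat / stat.sum()

print("forecast:", np.round(forecast, 3), "steady state:", np.round(stat, 3))
```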

3.
The following optimality principle is established for finite undiscounted or discounted Markov decision processes: If a policy is (gain, bias, or discounted) optimal in one state, it is also optimal for all states reachable from this state using this policy. The optimality principle is used constructively to demonstrate the existence of a policy that is optimal in every state, and then to derive the coupled functional equations satisfied by the optimal return vectors. This reverses the usual sequence, where one first establishes (via policy iteration or linear programming) the solvability of the coupled functional equations, and then shows that the solution is indeed the optimal return vector and that the maximizing policy for the functional equations is optimal for every state.
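A quick numerical check of this principle in the discounted case: compute the optimal values of a toy MDP, take a greedy policy, and verify that it attains the optimal value in every state reachable from a chosen starting state under that policy. All model data below are assumptions made up for illustration.

```python
# Numerical illustration of the optimality principle stated above (discounted case):
# if a policy is optimal in one state, it is optimal in every state reachable from
# it under that policy.  The MDP below is an assumed toy example.
import numpy as np

beta = 0.95
rng = np.random.default_rng(1)
n, m = 4, 2
P = rng.dirichlet(np.ones(n), size=(n, m))       # P[s, a, s']
r = rng.uniform(size=(n, m))

v = np.zeros(n)
for _ in range(2000):                            # value iteration to numerical optimality
    v = (r + beta * P @ v).max(axis=1)

pi = (r + beta * P @ v).argmax(axis=1)           # a policy optimal in state 0

# States reachable from state 0 under pi.
reachable, frontier = {0}, [0]
while frontier:
    s = frontier.pop()
    for t in map(int, np.flatnonzero(P[s, pi[s]] > 1e-12)):
        if t not in reachable:
            reachable.add(t)
            frontier.append(t)

# Exact value of pi, compared with the optimal value on the reachable set.
P_pi, r_pi = P[np.arange(n), pi], r[np.arange(n), pi]
v_pi = np.linalg.solve(np.eye(n) - beta * P_pi, r_pi)
assert all(abs(v_pi[s] - v[s]) < 1e-6 for s in reachable)
print("pi is optimal on every state reachable from state 0:", sorted(reachable))
```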

4.
5.
Fitting the value function in a Markovian decision process by a linear superposition of M basis functions reduces the problem dimensionality from the number of states down to M, with good accuracy retained if the value function is a smooth function of its argument, the state vector. This paper provides, for both the discounted and undiscounted cases, three algorithms for computing the coefficients in the linear superposition: linear programming, policy iteration, and least squares.
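Of the three algorithms listed, the least-squares route is the quickest to sketch: for a fixed policy, choose M basis functions and pick the coefficients that minimize the Bellman residual of the discounted evaluation equation. The chain, the reward, the discount factor, and the polynomial basis below are assumptions for illustration, not the paper's construction.

```python
# Sketch of a least-squares fit of the value function of a fixed policy by a
# linear combination of M basis functions, via Bellman-residual minimization.
# The chain, reward, discount factor and basis are illustrative assumptions.
import numpy as np

n, M, beta = 50, 4, 0.9
rng = np.random.default_rng(2)
P = rng.dirichlet(np.ones(n), size=n)              # transition matrix of the fixed policy
r = np.sin(np.linspace(0, 3, n))                   # smooth reward, so a small basis suffices

x = np.linspace(-1, 1, n)                          # state "feature"
Phi = np.vander(x, M, increasing=True)             # basis: 1, x, x^2, x^3

# Bellman residual: (Phi - beta * P Phi) c  ~  r, solved in the least-squares sense.
c, *_ = np.linalg.lstsq(Phi - beta * P @ Phi, r, rcond=None)

v_exact = np.linalg.solve(np.eye(n) - beta * P, r)
print("coefficients:", np.round(c, 3))
print("max approximation error:", np.max(np.abs(Phi @ c - v_exact)))
```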

6.
In this paper, we study variational inequalities in a real Hilbert space, which are governed by a strongly monotone and Lipschitz continuous operator F over a closed and convex set C. We assume that the set C can be outerly approximated by the fixed point sets of a sequence of certain quasi-nonexpansive operators called cutters. We propose an iterative method, the main idea of which is to project at each step onto a particular half-space constructed using the input data. Our approach is based on a method presented by Fukushima in 1986, which has recently been extended by several authors. In the present paper, we establish strong convergence in Hilbert space. We emphasize that, to the best of our knowledge, Fukushima’s method has so far been considered only in the Euclidean setting, with different conditions on F. We provide several examples for the case where C is the common fixed point set of a finite number of cutters, with numerical illustrations of our theoretical results.
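For readers unfamiliar with the construction, here is a much simplified finite-dimensional sketch of this kind of half-space (cutter) projection step: F is a strongly monotone affine operator, C is the unit ball written as {x : g(x) ≤ 0} with g(x) = ‖x‖² − 1, and each iteration projects a gradient-type step onto the half-space obtained from the subgradient inequality of g at the current point. The operator, step sizes, and iteration count are assumptions for illustration; this is not the exact scheme or the conditions analyzed in the paper.

```python
# Simplified half-space (cutter) projection iteration for a variational
# inequality VI(F, C) in R^n, with F strongly monotone affine and C the unit
# ball.  All data and step-size choices are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
n = 5
B = rng.normal(size=(n, n))
A = B @ B.T / n + np.eye(n)                    # F(x) = Ax + b is strongly monotone and Lipschitz
b = rng.normal(size=n)

def F(x):
    return A @ x + b

def project_halfspace(x, a, c):
    """Project x onto the half-space {y : a.y <= c}."""
    viol = a @ x - c
    return x if viol <= 0 else x - (viol / (a @ a)) * a

x = rng.normal(size=n)
for k in range(1, 20001):
    y = x - (1.0 / k) * F(x)                   # gradient-type step with diminishing step size
    g, grad = x @ x - 1.0, 2.0 * x             # g(x_k) and its gradient
    if g > 0:                                  # half-space from the subgradient inequality contains C
        y = project_halfspace(y, grad, grad @ x - g)
    x = y

print("approximate solution:", np.round(x, 4), " norm:", round(float(np.linalg.norm(x)), 4))
```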

7.
Summary: A method of successive approximations for discounted Markovian decision problems is described by MacQueen [1966]. This paper presents a set of methods that includes MacQueen's improved version of the standard dynamic-programming iterative scheme. Moreover, since a somewhat different approach is used, the physical meaning of some aspects of the successive approximation methods becomes more transparent. Some numerical results are given.
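A sketch of the kind of successive approximation scheme with MacQueen-type extrapolation bounds discussed here: after each standard dynamic-programming update, the span of the increment yields lower and upper bounds on the optimal value and hence an early stopping rule. The MDP data and tolerance are illustrative assumptions.

```python
# Successive approximation (value iteration) for a discounted MDP with
# MacQueen-type extrapolation bounds and the stopping rule they provide.
# The MDP data and tolerance are illustrative assumptions.
import numpy as np

n, m, beta, tol = 6, 3, 0.9, 1e-6
rng = np.random.default_rng(4)
P = rng.dirichlet(np.ones(n), size=(n, m))      # P[s, a, s']
r = rng.uniform(size=(n, m))

v = np.zeros(n)
for it in range(10_000):
    v_new = (r + beta * P @ v).max(axis=1)      # standard dynamic-programming update
    d = v_new - v
    lower = v_new + beta / (1 - beta) * d.min() # MacQueen-type bounds on the optimal value
    upper = v_new + beta / (1 - beta) * d.max()
    v = v_new
    if (upper - lower).max() < tol:             # stop once the bounds pinch together
        break

print(f"stopped after {it + 1} sweeps; v* is bracketed by")
print(np.round(lower, 5))
print(np.round(upper, 5))
```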

8.
The Least-Squares Monte Carlo Method (LSM) has become the standard tool to solve real options modeled as an optimal switching problem. The method has been shown to deliver accurate valuation results under complex and high-dimensional stochastic processes; however, the accuracy of the underlying decision policy is not guaranteed. For instance, an inappropriate choice of regression functions can lead to noisy estimates of the optimal switching boundaries, or even continuation/switching regions that are not clearly separated. As an alternative way to estimate these boundaries, we formulate a simulation-based method that starts from an initial guess and then iterates until reaching optimality. The algorithm is applied to a classical mine under a wide variety of underlying dynamics for the commodity price process. The method is first validated under a one-dimensional geometric Brownian motion and then extended to general Markovian processes. We consider two general specifications: a two-factor model with stochastic variance and a rich jump structure, and a four-factor model with stochastic cost-of-carry and stochastic volatility. The method is shown to be robust, stable, and easy to implement, converging to a more profitable strategy than the one obtained with LSM.
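Since the article takes the Least-Squares Monte Carlo method as its benchmark, a stripped-down version of that benchmark may help fix ideas: a Bermudan put under one-dimensional geometric Brownian motion, with the continuation value regressed on in-the-money paths at each exercise date. This is only the standard LSM building block under assumed parameters, not the mine switching problem or the boundary-iteration algorithm proposed in the paper.

```python
# Standard Least-Squares Monte Carlo (LSM) in its simplest form: a Bermudan put
# under geometric Brownian motion.  All parameters are illustrative assumptions.
import numpy as np

S0, K, r, sigma, T, steps, paths = 100.0, 100.0, 0.05, 0.2, 1.0, 50, 20_000
dt = T / steps
rng = np.random.default_rng(5)

# Simulate GBM paths (columns are the exercise dates dt, 2*dt, ..., T).
z = rng.standard_normal((paths, steps))
S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1))

payoff = np.maximum(K - S[:, -1], 0.0)                  # cashflow if held to maturity
for t in range(steps - 2, -1, -1):
    payoff *= np.exp(-r * dt)                           # discount one step back
    exercise = np.maximum(K - S[:, t], 0.0)
    itm = exercise > 0                                  # regress only on in-the-money paths
    if itm.sum() > 10:
        X = np.vander(S[:, t][itm] / K, 4, increasing=True)   # polynomial basis in moneyness
        coef, *_ = np.linalg.lstsq(X, payoff[itm], rcond=None)
        continuation = X @ coef
        ex_now = exercise[itm] > continuation           # noisy estimate of the exercise boundary
        payoff[np.flatnonzero(itm)[ex_now]] = exercise[itm][ex_now]

price = np.exp(-r * dt) * payoff.mean()                 # discount the first step to time 0
print("LSM Bermudan put estimate:", round(price, 3))
```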

9.
10.
11.
In this paper we study continuous-time Markov decision processes with a denumerable state space, a Borel action space, and unbounded transition and cost rates. The optimality criterion considered is the finite-horizon expected total cost criterion. Under suitable conditions, we propose a finite approximation for the approximate computation of an optimal policy and the value function, and obtain the corresponding error estimates. Furthermore, our main results are illustrated with a controlled birth and death system.
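To make the approximation idea concrete, here is a crude sketch for a controlled birth-and-death system: truncate the countable state space at a level N and march the finite-horizon optimality equation backwards in time with an explicit step. The rates, cost structure, truncation level, and the naive time discretization are all assumptions for illustration; the paper's approximation scheme and its error estimates are not reproduced.

```python
# Crude finite approximation for a finite-horizon controlled birth-and-death
# system: truncate the state space to {0, ..., N} and step the optimality
# equation backwards with an explicit Euler scheme.  All data are assumed.
import numpy as np

N, T, dt = 30, 2.0, 0.001            # truncation level, horizon, time step (assumed)
lam = 3.0                            # arrival (birth) rate
mus = [1.0, 4.0]                     # service (death) rate under each action
hold, effort = 1.0, 2.0              # cost rates: hold * state + effort * action index

V = np.zeros(N + 1)                  # terminal cost V(T, s) = 0 on the truncated space
for _ in range(int(T / dt)):
    V_new = np.empty_like(V)
    for s in range(N + 1):
        best = np.inf
        for a, mu in enumerate(mus):
            up = lam if s < N else 0.0                   # births blocked at the truncation level
            down = mu if s > 0 else 0.0
            cost_rate = hold * s + effort * a
            drift = up * (V[min(s + 1, N)] - V[s]) + down * (V[max(s - 1, 0)] - V[s])
            best = min(best, V[s] + dt * (cost_rate + drift))  # explicit step of the optimality equation
        V_new[s] = best
    V = V_new

print("approximate optimal expected total cost from small states:", np.round(V[:5], 3))
```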

12.
Continuous-time Markovian decision models with a countable state space are investigated. The existence of an optimal stationary policy is established for the expected average return criterion. It is shown that the expected average return can be expressed as the expected discounted return of a related Markovian decision process. A policy iteration method is given which converges to an optimal deterministic policy; the policy so obtained is shown to be optimal over all Markov policies.
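One standard way to relate such a continuous-time model to a discrete-time one, in the same spirit as the reduction described here, is uniformization: divide the generator by a common rate to obtain a discrete-time process, then compute the average return by relative value iteration. The sketch below does this on a finite (rather than countable) state space with invented rates and reward rates; it is not claimed to be the paper's construction or its policy iteration method.

```python
# Uniformization of a continuous-time decision model into a related
# discrete-time MDP, followed by relative value iteration for the average
# return.  Rates and reward rates are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(6)
n, m = 4, 2
Q = rng.uniform(0.1, 1.0, size=(m, n, n))             # transition rates, one generator per action
for a in range(m):
    np.fill_diagonal(Q[a], 0.0)
    np.fill_diagonal(Q[a], -Q[a].sum(axis=1))          # generator rows sum to zero
reward_rate = rng.uniform(size=(n, m))                 # reward earned per unit time

Lam = 1.05 * max(-Q[a, s, s] for a in range(m) for s in range(n))   # uniformization constant
P = np.stack([np.eye(n) + Q[a] / Lam for a in range(m)])            # P[a, s, s'] is stochastic
r = reward_rate / Lam                                               # reward per uniformized step

h = np.zeros(n)                                        # relative value iteration, reference state 0
for _ in range(20_000):
    Th = (r + np.einsum('asj,j->sa', P, h)).max(axis=1)
    gain, h = Th[0], Th - Th[0]

policy = (r + np.einsum('asj,j->sa', P, h)).argmax(axis=1)
print("stationary policy:", policy)
print("long-run average return per unit time:", round(Lam * gain, 4))
```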

13.
14.
In this paper we discuss discrete-time non-homogeneous discounted Markovian decision programming, where the state space and all action sets are countable. Under the assumption that the optimal value function is finite, we give necessary and sufficient conditions for the existence of an optimal policy. Under the assumption that the absolute mean of the rewards is relatively bounded, we also give necessary and sufficient conditions for the existence of an optimal policy.

15.
The use of the method of the Euler-Jacobi equation is considered in the study of a quadratic functional defined on a cone. Such functionals occur in the variational analysis of optimal-control problems. Several concepts are introduced with the aid of which the Euler-Jacobi equation is extended, and the application of this method is justified even when the equation is not a linear differential equation. Translated from Matematicheskie Zametki, Vol. 20, No. 6, pp. 847–858, December 1976.

16.
A survey is given of theories of singularities of systems of rays and wave fronts, that is, singularities of systems of extremals of variational problems and of solutions of the Hamilton-Jacobi equations near caustics. The problem of going around an obstacle bounded by a smooth surface in general position is studied in detail. Theorems are proved on the normal forms of Lagrangian manifolds with singularities, formed by the rays of the system of extremals of a variational problem in the symplectic space of all oriented lines which break away from the surface of the obstacle, as well as theorems on Legendre manifolds with singularities, formed by the contact elements of a wave front and the 1-jets of a solution of the Hamilton-Jacobi equation. Translated from Itogi Nauki i Tekhniki, Seriya Sovremennye Problemy Matematiki, Vol. 22, pp. 3–55, 1983.

17.
In this work we study the necessary and sufficient conditions for a positive random variable, whose expectation under the Wiener measure is one, to be represented as the Radon-Nikodym derivative of the image of the Wiener measure under an adapted perturbation of identity, with the help of the associated innovation process. We prove that the innovation conjecture holds if and only if the original process is almost surely invertible. We also give variational characterizations of the invertibility of perturbations of identity and of the representability of a positive random variable whose total mass is equal to unity. We prove in particular that an adapted perturbation of identity U = I_W + u satisfying the Girsanov theorem is invertible if and only if the kinetic energy of u is equal to the entropy of the measure induced by the action of U on the Wiener measure μ; in other words, U is invertible if and only if ½ E[∫₀¹ |u̇(s)|² ds] = H(Uμ | μ).

18.
We consider minimizing-risk problems in discounted Markov decision processes with a countable state space and bounded general rewards. We characterize the optimal values for the finite- and infinite-horizon cases and give two sufficient conditions for the existence of an optimal policy in the infinite-horizon case. These conditions are closely connected with Lemma 3 in White (1993), which is not correct, as Wu and Lin (1999) point out. We obtain a condition under which the lemma is true, and show that under it there is an optimal policy. Under another condition we show that the optimal value is the unique solution to a certain optimality equation and that there is an optimal policy on a transient set.

19.
We propose a new approach to accelerate the convergence of the modified policy iteration method for Markov decision processes with the total expected discounted reward criterion. In the new policy iteration, an additional operator is applied to the iterate generated by the Markov operator, resulting in a larger improvement in each iteration.
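For context, the baseline being accelerated is modified policy iteration, in which each iteration applies the Markov operator of the greedy policy only m times instead of solving the evaluation equation exactly. The sketch below implements that baseline on made-up data; the additional accelerating operator proposed in the paper is not reproduced.

```python
# Baseline modified policy iteration for a discounted MDP: a greedy improvement
# step followed by m partial evaluation sweeps with the Markov operator of the
# greedy policy.  The MDP data, m and the tolerance are illustrative assumptions.
import numpy as np

n, a, beta, m, tol = 8, 3, 0.9, 5, 1e-8
rng = np.random.default_rng(7)
P = rng.dirichlet(np.ones(n), size=(n, a))       # P[s, a, s']
r = rng.uniform(size=(n, a))

v = np.zeros(n)
while True:
    q = r + beta * P @ v
    pi = q.argmax(axis=1)                        # greedy (improved) policy
    v_new = q.max(axis=1)
    P_pi = P[np.arange(n), pi]
    r_pi = r[np.arange(n), pi]
    for _ in range(m):                           # m sweeps of the Markov operator of pi
        v_new = r_pi + beta * P_pi @ v_new
    if np.max(np.abs(v_new - v)) < tol:
        v = v_new
        break
    v = v_new

print("policy:", pi, "values:", np.round(v, 4))
```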

20.
We develop a stochastic calculus on the plane with respect to the local times of a large class of Lévy processes. We can then extend to these Lévy processes an Itô formula that was previously established for Brownian motion. Our method also provides a multidimensional version of the formula. We show that this formula generates many “Itô formulas” that fit various problems. In the special case of a linear Brownian motion, we recover a recently established Itô formula involving local times on curves; this formula is already used in financial mathematics.
