Similar literature
 20 similar articles found (search time: 31 ms)
1.
This paper considers the solution of Markov decision problems whose parameters can be obtained only via approximating schemes, or where it is computationally preferable to approximate the parameters rather than to employ exact algorithms for their computation. Various models are presented in which this situation occurs. Furthermore, it is shown that a modified value-iteration method may be employed, both for the discounted version and for the undiscounted version of the model, in order to solve the optimality equation and to find optimal policies. In both cases, the convergence rate is determined. As a side result, we characterize the asymptotic behavior of backward products of a geometrically convergent sequence of Markov matrices.
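
As an illustration of the idea in this abstract, here is a minimal sketch of discounted value iteration in which iteration k only sees the k-th approximation of the model parameters. The abstract does not spell out the paper's modified method, so this is not that method: the function name `approx_value_iteration`, the toy two-state model, and the geometric approximation scheme (error shrinking like 0.5^k) are all assumptions made for the example.

```python
import numpy as np

def approx_value_iteration(P_seq, r_seq, beta, n_iter):
    """Value iteration where step k uses approximate parameters.

    P_seq(k) returns transition matrices of shape (A, S, S) and
    r_seq(k) returns rewards of shape (A, S); both are hypothetical
    callables standing in for an approximating scheme.
    """
    v = np.zeros(r_seq(0).shape[1])
    for k in range(n_iter):
        P, r = P_seq(k), r_seq(k)
        # q[a, s] = r[a, s] + beta * sum_j P[a, s, j] * v[j]
        q = r + beta * (P @ v)
        v = q.max(axis=0)
    return v, q.argmax(axis=0)

# Toy model (assumed for illustration): 2 actions, 2 states.
P_true = np.array([[[0.9, 0.1], [0.2, 0.8]],
                   [[0.5, 0.5], [0.6, 0.4]]])
r_true = np.array([[1.0, 0.0],
                   [0.0, 2.0]])
U = np.full_like(P_true, 0.5)  # uniform transition law

# Approximations that converge geometrically to the true parameters;
# each P_seq(k) is a convex combination of stochastic matrices.
P_seq = lambda k: (1 - 0.5**k) * P_true + 0.5**k * U
r_seq = lambda k: r_true + 0.5**k

v_approx, pi_approx = approx_value_iteration(P_seq, r_seq, 0.9, 200)
# Reference run with the exact parameters at every step.
v_exact, pi_exact = approx_value_iteration(
    lambda k: P_true, lambda k: r_true, 0.9, 2000)
```

In this toy setting the approximation error decays faster than the discount factor contracts, so both runs should settle on the same value function and policy; with a slower approximating scheme the approximation error, rather than the discount factor, would be expected to govern the rate.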

2.
We show that the LP formulation for an undiscounted multi-chain Markov decision problem can be put in a block upper-triangular form by a polynomial time procedure. Each minimal block (after an appropriate dynamic revision) gives rise to a single-chain Markov decision problem which can be treated independently. An optimal solution to each single-chain problem can be connected by auxiliary dual programs to obtain an optimal solution to a multi-chain problem.

3.
We study hemispaces (i.e., convex sets with convex complements) in R^n. We give several geometric characterizations of hemispaces and several ways of representing them with the aid of linear operators and lexicographical order. We obtain a metric-affine classification of hemispaces, in terms of their “rank” and “type,” and a “decomposition theorem.” We also give some characterizations of affine transformations which preserve a hemispace.

4.
This paper proposes a value iteration method which finds an ε-optimal policy of an undiscounted multichain Markov decision process in a finite number of iterations. The undiscounted multichain Markov decision process is reduced to an aggregated Markov decision process, which utilizes maximal gains of undiscounted Markov decision sub-processes and is formulated as an optimal stopping problem. As a preliminary, sufficient conditions are presented under which a policy is ε-optimal.

5.
We give several characterizations of Banach lattices on which each positive Dunford-Pettis operator is compact. As consequences, we obtain new sufficient and necessary conditions under which a norm of a Banach lattice is order continuous, a positive weakly compact operator is compact and the dual operator of a positive Dunford-Pettis operator is Dunford-Pettis.

6.
Optimization, 2012, 61(4): 773-800
In this paper we study the risk-sensitive average cost criterion for continuous-time Markov decision processes in the class of all randomized Markov policies. The state space is a denumerable set, and the cost and transition rates are allowed to be unbounded. Under suitable conditions, we establish the optimality equation of the auxiliary risk-sensitive first passage optimization problem and obtain the properties of the corresponding optimal value function. Then, by constructing appropriate approximating sequences of the cost and transition rates and employing the results on the auxiliary optimization problem, we show the existence of a solution to the risk-sensitive average optimality inequality and develop a new approach, called the risk-sensitive average optimality inequality approach, to prove the existence of an optimal deterministic stationary policy. Furthermore, we give some sufficient conditions for the verification of the simultaneous Doeblin condition, use a controlled birth and death system to illustrate our conditions, and provide an example for which the risk-sensitive average optimality strict inequality occurs.

7.
We give a unified method to obtain the conservativeness of a class of Markov processes associated with lower bounded semi-Dirichlet forms on L^2(X; m), including symmetric diffusion processes, some non-symmetric diffusion processes and jump type Markov processes on X, where X is a locally compact separable metric space and m is a positive Radon measure on X with full topological support. Using this method, we give in each section an example establishing the conservativeness of the processes, with conditions stated in terms of the “increasingness of the volume of some sets (balls)” and “that of the coefficients on the sets” of the Markov processes.

8.
Mihai Popescu, Fernand Pelletier. PAMM, 2007, 7(1): 2060071-2060072
In this work we study the trajectories which are tangent to an affine sub-bundle in the tangent bundle of a manifold and which minimize the “total energy”. We give some characterizations of such “regular” trajectories in terms of control theory and geometrical theory. We also establish some sufficient conditions for the existence of such curves. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

9.
Optimization, 2012, 61(3-4): 385-392
In the steady state of an undiscounted Markov decision process, we consider the problem of finding an optimal stationary probability distribution that maximizes the ratio of mean to standard deviation among all stationary probability distributions. This problem brings a relative point of view into the study of MDPs.

10.
In this paper, we first give some characterizations for P-class functions. Then, giving a Hermite–Hadamard type inequality for P-class functions, we prove the equivalence of some significant metrics in normed linear spaces. We also obtain an operator version of the Jensen inequality for P-class functions. Introducing operator (mid) P-class functions, we present some characterizations for such functions.

11.
The operator square root of the Laplacian (-Δ)^{1/2} can be obtained from the harmonic extension problem to the upper half space as the operator that maps the Dirichlet boundary condition to the Neumann condition. In this article, we obtain similar characterizations for general fractional powers of the Laplacian and other integro-differential operators. From those characterizations we derive some properties of these integro-differential equations from purely local arguments in the extension problems.
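
The first sentence of this abstract can be made concrete with a short worked computation. The following is a standard sketch for the square-root case only, assuming f is a nice (for example, Schwartz) function on R^n and u is the bounded harmonic extension of f; the symbols u, f, and the Fourier variable ξ are notation introduced here, not taken from the abstract.

```latex
\begin{aligned}
&\Delta_{x,y}\, u(x,y) = 0 \quad (x \in \mathbb{R}^n,\ y > 0), \qquad u(x,0) = f(x),\\
&\hat u(\xi, y) = e^{-|\xi| y}\,\hat f(\xi)
  \quad\Longrightarrow\quad
  -\partial_y \hat u(\xi, 0) = |\xi|\,\hat f(\xi),\\
&\text{hence}\quad -\partial_y u(x,0) = (-\Delta)^{1/2} f(x).
\end{aligned}
```

Taking the Fourier transform in x turns the Laplace equation into an ordinary differential equation in y, whose decaying solution gives the multiplier |ξ|, which is exactly the symbol of (-Δ)^{1/2}.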

12.
We study eigenvalue problems Fy = λGy consisting of Hamiltonian systems of ordinary differential equations on a compact interval with symmetric λ-linear boundary conditions. The problems we are interested in are non-definite: neither left- nor right-definite. Instead, we give a weak condition on one coefficient of the Hamiltonian system which ensures that a hermitian form associated with the operator F has at most finitely many negative squares. This enables us to study the problem with the help of a compact self-adjoint operator in a Pontrjagin space, and we obtain as a main result uniformly convergent eigenfunction expansions. In the final section, applications to formally self-adjoint differential equations of higher order are given.

13.
We obtain sufficient conditions for a “holomorphic” semigroup of unbounded operators to possess a boundary group of bounded operators. The theorem is applied to generalize to unbounded operators results of Kantorovitz about the similarity of certain perturbations. Our theory includes a result of Fisher on the Riemann-Liouville semigroup in L^p(0, ∞), 1 < p < ∞. In this particular case we also give an alternative approach, where the boundary group is obtained as the limit of groups in the weak operator topology.

14.
Whitney’s 2-switching theorem states that any two embeddings of a 2-connected planar graph in S^2 can be connected via a sequence of simple operations, named 2-switching. In this paper, we obtain two operations on planar graphs from the viewpoint of knot theory, which we will term “twisting” and “2-switching” respectively. With the twisting operation, we give a purely geometrical proof of Whitney’s 2-switching theorem. As an application, we obtain some relationships between two knots which correspond to the same signed planar graph. In addition, we give a necessary and sufficient condition to test whether a pair of reduced alternating diagrams are mutants of each other by their signed planar graphs.

15.
This note describes sufficient conditions under which total-cost and average-cost Markov decision processes (MDPs) with general state and action spaces, and with weakly continuous transition probabilities, can be reduced to discounted MDPs. For undiscounted problems, these reductions imply the validity of optimality equations and the existence of stationary optimal policies. The reductions also provide methods for computing optimal policies. The results are applied to a capacitated inventory control problem with fixed costs and lost sales.

16.
We study the rate of convergence of a sequence of linear operators that converges pointwise to a linear operator. Our main interest is in characterizing the slowest type of pointwise convergence possible. This is a continuation of the paper Deutsch and Hundal (2010) [14]. The main result is a “lethargy” theorem (Theorem 3.3) which gives useful conditions that guarantee arbitrarily slow convergence. In the particular case when the sequence of linear operators is generated by the powers of a single linear operator, we obtain a “dichotomy” theorem, which states the surprising result that either there is linear (fast) convergence or arbitrarily slow convergence; no other type of convergence is possible. The dichotomy theorem is applied to generalize and sharpen: (1) the von Neumann–Halperin cyclic projections theorem, (2) the rate of convergence for intermittently (i.e., “almost” randomly) ordered projections, and (3) a theorem of Xu and Zikatanov.

17.
18.
In this paper, we give a necessary and sufficient condition for a locally biholomorphic mapping f on the unit ball B in a complex Hilbert space X to be a biholomorphic convex mapping, which improves some results of Hamada and Kohr and solves a problem posed by Graham and Kohr. From this, we derive some sufficient conditions for biholomorphic convex mappings. We also introduce a linear operator in order to construct some concrete examples of biholomorphic convex mappings on B in Hilbert spaces. Moreover, we give some examples of biholomorphic convex mappings on B in Hilbert spaces.

19.
For a given multi-objective optimization problem, we introduce and study the notion of α-proper efficiency. We give two characterizations of such proper efficiency: one is in terms of exact penalization and the other is in terms of stability of associated parametric problems. Applying the aforementioned characterizations and recent results on global error bounds for inequality systems, we obtain verifiable conditions for α-proper efficiency. For a large class of polynomial multi-objective optimization problems, we show that any efficient solution is α-properly efficient under some mild conditions. For a convex quadratically constrained multi-objective optimization problem with convex quadratic objective functions, we show that any efficient solution is α-properly efficient with a known estimate on α whenever its constraint set is bounded. Finally, we illustrate our achieved results with examples, and give an example to show that such an enhanced efficiency property may not hold for multi-objective optimization problems involving C -functions as objective functions.

20.
The well-known Hammersley–Clifford Theorem states (under certain conditions) that any Markov random field is a Gibbs state for a nearest neighbor interaction. In this paper we study Markov random fields for which the proof of the Hammersley–Clifford Theorem does not apply. Following Petersen and Schmidt we utilize the formalism of cocycles for the homoclinic equivalence relation and introduce “Markov cocycles”, reparametrizations of Markov specifications. The main part of this paper exploits this to deduce the conclusion of the Hammersley–Clifford Theorem for a family of Markov random fields which are outside the theorem’s purview where the underlying graph is Z^d. This family includes all Markov random fields whose support is the d-dimensional “3-colored chessboard”. On the other extreme, we construct a family of shift-invariant Markov random fields which are not given by any finite range shift-invariant interaction.
