Similar Literature
20 similar documents found (search time: 15 ms)
1.
A controlled system under dynamic disturbances is considered. We formulate the problem of finding a strategy that is optimal in the sense of Savage's minimax risk (regret) criterion, list basic properties of such problems, and describe a construction of a strategy optimal in the above sense for one class of systems containing linear systems.
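The minimax-regret idea can be illustrated on a static decision table, a toy stand-in for the paper's dynamic, controlled-system setting; the cost matrix below is hypothetical:

```python
def minimax_regret(costs):
    """costs[a][s]: cost of action a under scenario s.
    Returns (best action index, its worst-case regret)."""
    n_scen = len(costs[0])
    # best achievable cost in each scenario
    best = [min(row[s] for row in costs) for s in range(n_scen)]
    # worst-case (maximum) regret of each action over all scenarios
    worst = [max(row[s] - best[s] for s in range(n_scen)) for row in costs]
    a = min(range(len(costs)), key=lambda i: worst[i])
    return a, worst[a]

costs = [
    [4, 9],   # action 0: good in scenario 0, poor in scenario 1
    [6, 6],   # action 1: balanced
    [10, 3],  # action 2: poor in scenario 0, good in scenario 1
]
action, regret = minimax_regret(costs)   # action 1, worst-case regret 3
```

Savage's criterion compares each action not against absolute cost but against the best achievable cost in each scenario, which is why the balanced action wins here.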

2.
Stability is fundamental to the operation of a control system, but optimality is the ultimate goal for achieving maximum performance. This paper investigates event-triggered pinning optimal consensus control for the switched multi-agent system (SMAS) via a switched adaptive dynamic programming (ADP) method. The technical contribution lies mainly in two aspects. On the one hand, to optimize control performance and ensure consensus, a switched local value function (SLVF) and a minimum-error switching law are constructed. Based on the SLVF, a switched ADP policy-iteration algorithm is proposed, and its convergence and optimality are proved. On the other hand, since it is impractical to install a controller on every agent, a pinning strategy is developed to guide the placement of the ADP controllers, reducing the waste of control resources. A new condition is derived to determine the minimum number of controlled vertices of the SMAS. Lastly, a numerical example verifies the effectiveness of the proposed method.

3.
This paper investigates a dynamic event-triggered optimal control problem for discrete-time (DT) nonlinear Markov jump systems (MJSs) by exploring policy iteration (PI) adaptive dynamic programming (ADP) algorithms. The performance index function (PIF) defined for each subsystem is updated by an online PI algorithm, and the corresponding control policy is derived by solving the optimal PIF. Then, neural network (NN) techniques, comprising an actor network and a critic network, are adopted to estimate the iterative PIF and control policy. Moreover, a dynamic event-triggered mechanism (DETM) is designed to avoid wasting resources when the estimated iterative control policy is updated. Finally, based on the Lyapunov difference method, it is proved that system stability and the convergence of all signals are guaranteed under the developed control scheme. A simulation example for DT nonlinear MJSs with two system modes demonstrates the feasibility of the control design scheme.

4.
Delsarte's method and its extensions allow one to treat the upper-bound problem for codes in two-point homogeneous spaces as a linear programming problem with possibly infinitely many variables representing the distance distribution. We show that, by using power sums of distances as variables, this problem can be cast as a finite semidefinite programming problem. This method improves some linear programming upper bounds. In particular, we obtain new bounds on one-sided kissing numbers.

5.
This paper presents the control of discrete chaotic systems by designing linear feedback controllers. The linear feedback control problem for nonlinear systems is formulated from the viewpoint of dynamic programming. To suppress chaos with minimum control effort, the system is stabilized at its first-order unstable fixed point (UFP). The method can also be employed to stabilize any desired nth-order fixed point of the system. Two different methods for stabilizing higher-order UFPs are suggested. These methods are then applied to two well-known chaotic discrete systems: the logistic and Hénon maps. For each of them, the first- and second-order UFPs in their chaotic regions are stabilized, and simulation results demonstrate the performance.
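A minimal sketch of stabilizing the first-order UFP of the logistic map by linear state feedback. The gain below comes from linearization about the fixed point (not from the paper's dynamic-programming design), and all parameters are illustrative:

```python
# Logistic map x_{n+1} = r*x_n*(1 - x_n) with feedback u_n = -K*(x_n - x*).
r = 3.9                      # chaotic regime
x_star = 1.0 - 1.0 / r       # first-order fixed point
# slope of the map at x_star is f'(x_star) = 2 - r, so |f'| > 1: unstable
K = 2.0 - r                  # deadbeat gain: closed-loop slope at x_star is 0

x = 0.3
for n in range(50):
    # activate the controller only inside a small window around the UFP,
    # as in OGY-style chaos control; elsewhere the map runs uncontrolled
    u = -K * (x - x_star) if abs(x - x_star) < 0.1 else 0.0
    x = r * x * (1.0 - x) + u
# inside the window the closed-loop error obeys e_{n+1} = -r * e_n^2,
# so once captured the orbit converges to x_star very rapidly
```

The small activation window keeps the control effort minimal, matching the paper's emphasis on suppressing chaos with minimum control effort.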

6.
7.
We consider linear programming problems with uncertain objective function coefficients. For each coefficient of the objective function, an interval of uncertainty is known, and it is assumed that any coefficient can take on any value from the corresponding interval of uncertainty, regardless of the values taken by other coefficients. It is required to find a minmax regret solution. This problem received considerable attention in the recent literature, but its computational complexity status remained unknown. We prove that the problem is strongly NP-hard. This gives the first known example of a minmax regret optimization problem that is NP-hard in the case of interval-data representation of uncertainty but is polynomially solvable in the case of discrete-scenario representation of uncertainty.

8.
9.
Negative dynamic programming for risk-sensitive control is studied. Under some compactness and semicontinuity assumptions the following results are proved: the convergence of the value iteration algorithm to the optimal expected total reward, the Borel measurability or upper semicontinuity of the optimal value functions, and the existence of an optimal stationary policy.
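Value iteration for a total-reward ("negative") dynamic program can be sketched on a tiny finite model; the chain below is invented for illustration, whereas the paper treats general state spaces:

```python
# Two transient states, an absorbing goal, and non-positive rewards.
trans = {  # trans[state][action] = (next_state, reward)
    0: {'a': (1, -1.0), 'b': ('goal', -3.0)},
    1: {'a': ('goal', -1.0), 'b': (0, -0.5)},
}

# value iteration: repeatedly apply the Bellman backup starting from 0
V = {0: 0.0, 1: 0.0, 'goal': 0.0}
for _ in range(100):
    V = {s: (0.0 if s == 'goal' else
             max(rew + V[ns] for ns, rew in trans[s].values()))
         for s in V}
# V[0] = -2 (route 0 -> 1 -> goal), V[1] = -1 (go straight to goal)
```

The iterates converge monotonically downward to the optimal expected total reward, which is the convergence result the paper establishes in far greater generality.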

10.
We consider a general finite-stage dynamic programming model. Bounds are derived for the approximation of the minimum expected total cost and of the optimal policy. The theory is applied to an inventory model to give bounds for good order policies.

11.
12.
Koole, Ger. Queueing Systems, 1998, 30(3–4): 323–339.
In this paper we study monotonicity results for optimal policies of various queueing and resource-sharing models. The standard approach is to propagate, for each specific model, certain properties of the dynamic programming value function. We propose a unified treatment of these models by concentrating on the events and the form of the value function rather than on the value function itself. This is illustrated with a systematic treatment of one- and two-dimensional models. This revised version was published online in June 2006 with corrections to the cover date.
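As a concrete (hypothetical) instance of propagating a structural property through the value iteration, the sketch below runs a uniformized M/M/1 admission-control model and checks that monotonicity of the value function in the queue length survives every backup; all parameters are invented:

```python
N = 20                   # queue-length truncation
lam, mu = 0.4, 0.5       # uniformized arrival / service probabilities
R, c = 5.0, 1.0          # admission reward, holding cost per job
beta = 0.95              # discount factor

V = [0.0] * (N + 1)      # V[x]: value with x jobs in the system
for _ in range(300):
    W = []
    for x in range(N + 1):
        admit = R + beta * V[min(x + 1, N)]          # accept the arrival
        reject = beta * V[x]                         # turn it away
        W.append(-c * x
                 + lam * max(admit, reject)          # arrival event
                 + mu * beta * V[max(x - 1, 0)]      # service event
                 + (1.0 - lam - mu) * beta * V[x])   # dummy event
    V = W
# each event operator maps a non-increasing V to a non-increasing W,
# so monotonicity in x is preserved by every iteration
```

Organizing the argument event by event (arrival, service, dummy), rather than per model, is exactly the unification the paper advocates.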

13.
In this paper we propose a discrete algorithm for tracking control of a two-wheeled mobile robot (WMR), using an advanced Adaptive Critic Design (ACD). We use the Dual Heuristic Programming (DHP) algorithm, which consists of two parametric structures implemented as neural networks (NNs): an actor and a critic, both realized as Random Vector Functional Link (RVFL) NNs. In the proposed algorithm the control system consists of the DHP adaptive critic, a PD controller, and a supervisory term derived from the Lyapunov stability theorem. The supervisory term guarantees stable realization of the tracking movement during the learning phase of the adaptive critic structure, as well as robustness in the face of disturbances. The discrete tracking control algorithm works online, uses the WMR model for state prediction, and does not require preliminary learning. The performance of the proposed control algorithm is verified by a series of experiments on the WMR Pioneer 2-DX.

14.
We present intensional dynamic programming (IDP), a generic framework for structured dynamic programming over atomic, propositional and relational representations of states and actions. We first develop set-based dynamic programming and show its equivalence with classical dynamic programming. We then show how to describe state sets intensionally using any form of structured knowledge representation and obtain a generic algorithm that can optimally solve large, even infinite, MDPs without explicit state space enumeration. We derive two new Bellman backup operators and algorithms. In order to support the view of IDP as a Rosetta stone for structured dynamic programming, we review many existing techniques that employ either propositional or relational knowledge representation frameworks.
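A minimal illustration of the set-based idea, on a toy shortest-path MDP whose states split into blocks of provably equal value; the model is invented, and the paper's framework is far more general:

```python
n = 3
states = list(range(-n, n + 1))   # goal at 0, move left/right, cost 1 per move

# flat value iteration: one Bellman backup per state
V = {s: 0.0 for s in states}
for _ in range(20):
    V = {s: 0.0 if s == 0 else
         1.0 + min(V[max(s - 1, -n)], V[min(s + 1, n)])
         for s in states}

# set-based variant: by symmetry, states with equal |s| share a value,
# so one backup per block {s : |s| = d} suffices
B = {d: 0.0 for d in range(n + 1)}
for _ in range(20):
    B = {d: 0.0 if d == 0 else
         1.0 + min(B[d - 1], B[min(d + 1, n)])
         for d in B}
# expanding the block values reproduces the flat solution exactly
```

Here the blocks are listed extensionally; IDP's point is that they can be described intensionally (e.g. by formulas), so the same backup scales to huge or infinite state spaces.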

15.
The purpose of this paper is to draw a detailed comparison between Newton's method, as applied to discrete-time, unconstrained optimal control problems, and the second-order method known as differential dynamic programming (DDP). The main outcomes of the comparison are: (i) DDP does not coincide with Newton's method, but (ii) the methods are close enough that they share the same convergence rate, namely, quadratic. The comparison also reveals some other facts of theoretical and computational interest. For example, the methods differ only in that Newton's method operates on a linear approximation of the state at a certain point where DDP operates on the exact value. This suggests that DDP ought to be more accurate, an anticipation borne out in our computational example. Also, the positive definiteness of the Hessian of the objective function is easy to check within the framework of DDP. This enables one to propose a modification of DDP so that a descent direction is produced at each iteration, regardless of the Hessian. Efforts of the first author were partially supported by the South African Council for Scientific and Industrial Research, and those of the second author by NSF Grants Nos. CME-79-05010 and CEE-81-10778.

16.
This contribution extends a numerical method for solving optimal control problems by dynamic programming to a class of hybrid dynamic systems with autonomous as well as controlled switching. The value function of the hybrid control system is calculated based on a full discretization of the state and input spaces. A bound for the error due to discretization is obtained by modeling the error as a perturbation of the continuous dynamics and the cost terms. It is shown that the bound approaches zero, and that the value function of the discretized variant converges to the value function of the original problem, as the discretization parameters go to zero. The performance of a numerical scheme exploiting the discretized system is illustrated on two different examples treated previously in the literature.
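The discretize-then-iterate scheme can be sketched for a scalar (non-hybrid) toy problem; the grid, input set, and cost below are illustrative choices, not the authors':

```python
M = 41                                        # grid points on [-1, 1]
xs = [-1.0 + 2.0 * i / (M - 1) for i in range(M)]
us = [(j - 5) / 10.0 for j in range(11)]      # inputs in {-0.5, ..., 0.5}
gamma = 0.9                                   # discount factor

def nearest(x):
    """Project a successor state onto the grid (with saturation)."""
    i = int(round((x + 1.0) * (M - 1) / 2.0))
    return min(max(i, 0), M - 1)

# value iteration on the discretized system x_{k+1} = x_k + u_k,
# stage cost x^2 + u^2; the projection step is the source of the
# discretization error the paper bounds
V = [0.0] * M
for _ in range(200):
    V = [min(x * x + u * u + gamma * V[nearest(x + u)] for u in us)
         for x in xs]
```

Refining `M` and the input grid shrinks the projection error, mirroring the paper's result that the discretized value function converges to the true one as the discretization parameters go to zero.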

17.
In this paper we study the problem of utility indifference pricing in a constrained financial market, using a utility function defined over the positive real line. We present a convex risk measure −v(·; y) satisfying q(x, F) = x + v(F; u0(x)), where u0(x) is the maximal expected utility of a small investor with initial wealth x, and q(x, F) is the utility indifference buy price for a European contingent claim with discounted payoff F. We provide a dynamic programming equation associated with the risk measure −v, and characterize v as a viscosity solution of this equation.

18.
In intensity-modulated radiotherapy (IMRT), a treatment is designed to deliver high radiation doses to tumors while avoiding healthy tissue. Optimization-based treatment planning often produces sharp dose gradients between tumors and healthy tissue. Random shifts during treatment can cause significant differences between the dose in the "optimized" plan and the actual dose delivered to a patient. An IMRT treatment plan is delivered as a series of small daily dosages, or fractions, over a period of time (typically 35 days). It has recently become technically possible to measure variations in patient setup and the delivered doses after each fraction. We develop an optimization framework that exploits the dynamic nature of radiotherapy and this information gathering by adapting the treatment plan in response to temporal variations measured during the treatment course of an individual patient. The resulting (suboptimal) control policies, which re-optimize before each fraction, include two approximate dynamic programming schemes: certainty equivalent control (CEC) and open-loop feedback control (OLFC). Computational experiments show that the resulting individualized adaptive radiotherapy plans promise a considerable improvement over non-adaptive treatment plans, while remaining computationally feasible to implement.

19.
This paper addresses Markov Decision Processes over compact state and action spaces. We investigate the special case of linear dynamics and piecewise-linear and convex immediate costs for the average cost criterion. This model is very general and covers many interesting examples, for instance in inventory management. Due to the curse of dimensionality, the problem is intractable and optimal policies usually cannot be computed, not even for instances of moderate size.

20.
Naive implementations of Newton's method for unconstrained N-stage discrete-time optimal control problems with Bolza objective functions tend to increase in cost like N^3 as N increases. However, if the inherent recursive structure of the Bolza problem is properly exploited, the cost of computing a Newton step will increase only linearly with N. The efficient Newton implementation scheme proposed here is similar to Mayne's DDP (differential dynamic programming) method but produces the Newton step exactly, even when the dynamical equations are nonlinear. The proposed scheme is also related to a Riccati treatment of the linear two-point boundary-value problems that characterize optimal solutions. For discrete-time problems, the dynamic programming approach and the Riccati substitution differ in an interesting way; however, these differences essentially vanish in the continuous-time limit. This work was supported by the National Science Foundation, Grant No. DMS-85-03746.
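For the linear-quadratic special case, the linear-in-N step is exactly the classical backward Riccati sweep; the scalar sketch below (system and weights are illustrative) verifies the dynamic-programming identity that the realized optimal cost equals P_0 x_0^2:

```python
# Scalar LQ problem: x_{k+1} = a*x_k + b*u_k,
# cost sum_{k<N} (q*x_k^2 + r*u_k^2) + q*x_N^2.
a, b, q, r = 1.2, 1.0, 1.0, 0.5
N = 30

# Backward Riccati sweep: O(N) work, one pass, exact optimal gains.
P = [0.0] * (N + 1)
K = [0.0] * N
P[N] = q                                     # terminal condition
for k in range(N - 1, -1, -1):
    S = r + b * b * P[k + 1]
    K[k] = a * b * P[k + 1] / S              # optimal feedback gain
    P[k] = q + a * a * P[k + 1] - (a * b * P[k + 1]) ** 2 / S

# Forward rollout with u_k = -K[k]*x_k; the realized cost must equal
# the value function V_0(x0) = P[0]*x0^2.
x0 = 1.5
x, cost = x0, 0.0
for k in range(N):
    u = -K[k] * x
    cost += q * x * x + r * u * u
    x = a * x + b * u
cost += q * x * x                            # terminal cost
```

In this quadratic case a single sweep is the Newton step; the paper's contribution is obtaining the exact Newton step with the same recursive structure even when the dynamics are nonlinear.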


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号