Similar literature
 20 similar documents found.
1.
For a given controlled Markov chain with a non-additive forward recursive objective function criterion, Iwamoto recently established a formal transformation, via invariant imbedding, that constructs a controlled Markov chain solvable in a backward manner, as in backward induction for finite-horizon Markov decision processes (MDPs). Chang et al. presented formal methods, called “parallel rollout” and “policy switching,” for combining multiple given policies in MDPs and showed that the policies generated by both methods improve on all of the policies that the methods combine. This brief paper extends parallel rollout and policy switching to forward recursive objective function criteria and shows that a similar property holds as in MDPs. We further discuss how to implement these methods via simulation.
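The policy-switching idea can be illustrated with a minimal sketch for the classical additive finite-horizon case only (not the forward recursive extension the abstract describes). The toy MDP, its numbers, and the names `evaluate`, `switching`, `always_0`, and `always_1` are assumptions made for illustration.

```python
# Toy finite-horizon MDP (hypothetical numbers, for illustration only).
# P[s][a] -> list of (next_state, probability); R[s][a] -> one-step reward.
P = {
    0: {0: [(0, 0.5), (1, 0.5)], 1: [(2, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 0.3), (2, 0.7)]},
    2: {0: [(2, 1.0)], 1: [(0, 0.6), (1, 0.4)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 2.0}, 2: {0: 0.5, 1: 1.0}}
H = 5  # horizon


def evaluate(policy):
    """Backward induction for a fixed policy(t, s) -> action; returns V[t][s]."""
    V = [{s: 0.0 for s in P} for _ in range(H + 1)]  # terminal value V[H] = 0
    for t in range(H - 1, -1, -1):
        for s in P:
            a = policy(t, s)
            V[t][s] = R[s][a] + sum(p * V[t + 1][s2] for s2, p in P[s][a])
    return V


def always_0(t, s):
    return 0


def always_1(t, s):
    return 1


base = [always_0, always_1]
base_values = [evaluate(pi) for pi in base]


def switching(t, s):
    """At (t, s), follow the base policy whose value from (t, s) is largest."""
    best = max(range(len(base)), key=lambda i: base_values[i][t][s])
    return base[best](t, s)


V_ps = evaluate(switching)
```

In this sketch, `V_ps[t][s]` should be at least as large as the value of every base policy at every stage and state, which is the improvement property the abstract refers to.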

2.
This paper studies an optimization problem in which the objective function cannot be given completely in closed form. In particular, we assume that some part of the objective function must be computed by an approximation process. This paper develops a technique for solving a class of such problems. Examples demonstrating the technique, and problem areas in which it has been successfully applied, are also given.

3.
We investigate the inverse problem of the classical Markowitz mean-variance formulation: given a mean-variance pair, find initial investment levels and their corresponding portfolio policies such that the given pair can be realized. We reveal that any mean-variance pair inside the reachable region can be achieved by multiple portfolio policies associated with different initial investment levels. Therefore, in the mean-variance world for a market of all risky assets, the common belief in monotonicity, ‘The more you invest, the larger the expected future wealth you can expect for a given risk (variance) level’, does not hold. This motivates us to extend the classical two-objective mean-variance framework to a three-objective framework: to maximize the mean and minimize the variance of the final wealth, and also to minimize the initial investment level. As a result, we eliminate from the policy candidate list the set of pseudo-efficient policies, which are efficient in the original mean-variance space but inefficient in the newly introduced three-dimensional objective space.

4.
We consider a class of problems concerned with maximizing probabilities, given stage-wise targets, which generalizes the standard threshold probability problem in Markov decision processes. The objective function is the probability that, at all stages, the associatively combined accumulation of rewards earned up to that point takes its value in a specified stage-wise interval. It is shown that this class reduces to the case of the nonnegative-valued multiplicative criterion through an invariant imbedding technique. We derive a recursive formula for the optimal value function and an effective method for obtaining the optimal policies.
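A minimal sketch of the state-augmentation idea behind such problems, assuming additive accumulation and a tiny hypothetical model: the accumulated reward is carried as part of the state, so that the probability of staying inside every stage-wise interval satisfies an ordinary backward recursion. The model, the intervals, and the name `success_prob` are illustrative assumptions, not the paper's formulation.

```python
import math
from functools import lru_cache

# OUTCOMES[s][a] -> list of (probability, reward, next_state); hypothetical data.
OUTCOMES = {
    0: {
        "safe": [(1.0, 1, 0)],
        "risky": [(0.5, 3, 0), (0.5, 0, 0)],
    }
}
# INTERVALS[t] = (lo, hi): the accumulated reward after stage t must lie in [lo, hi].
INTERVALS = [(-math.inf, math.inf), (4, math.inf)]
HORIZON = len(INTERVALS)


@lru_cache(maxsize=None)
def success_prob(t, s, c):
    """Max probability that all remaining stage-wise targets are met,
    starting at stage t in state s with accumulated reward c."""
    if t == HORIZON:
        return 1.0  # every stage-wise target has been met
    lo, hi = INTERVALS[t]
    best = 0.0
    for outcomes in OUTCOMES[s].values():
        value = sum(
            p * success_prob(t + 1, s2, c + r)
            for p, r, s2 in outcomes
            if lo <= c + r <= hi  # outcomes that violate the target contribute 0
        )
        best = max(best, value)
    return best


best = success_prob(0, 0, 0)
```

Here the only binding target is an accumulated reward of at least 4 after the second stage; any policy that plays "risky" exactly once reaches it with probability 0.5, which is the optimum for this toy model.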

5.
This paper suggests a new method for generating the Pareto front in multi-objective Markov chains, which overcomes some drawbacks of existing multi-objective methods. A fundamental issue is to find strong Pareto policies, that is, policies whose cost-function value is the closest in Euclidean norm to the utopian point. Each strong Pareto policy is reached when each cost function, constrained by the strategy of the others, cannot further improve its own criterion. Constraints associated with the objective function are implemented by formulating the problem as a bi-level optimization problem. We convert it into a single-level optimization problem by introducing a generalized Lagrangian function that represents the original multi-objective problem in terms of a related nonlinear programming problem. Then we apply the Tikhonov regularization method to the objective function. The regularization ensures that all Pareto policies generated along the Pareto front are strong Pareto policies. To solve the problem we employ the extra-proximal method, which effectively approximates every optimal Pareto point (in this case a strong Pareto point) on the Pareto front. An experimental result, for a route-selection problem in counter-kidnapping, validates the effectiveness and usefulness of the method.

6.
Numerous scheduling policies are designed to differentiate quality of service for different applications. Service differentiation can in fact be formulated as a generalized resource allocation optimization aimed at minimizing some important system characteristics. For complex scheduling policies, however, optimization can be a demanding task because the system at hand is difficult to analyze analytically. In this paper, we study the optimization problem in a queueing system with two traffic classes, a work-conserving parameterized scheduling policy, and an objective function that is a convex combination of either linear, convex, or concave increasing functions of given performance measures of both classes. In the case of linear and concave functions, we show that the optimum is always attained at an extreme value of the parameter. Furthermore, we prove that this is not necessarily the case for convex functions; in this case, a unique local minimum exists. This information greatly simplifies the optimization problem. We apply the framework to some interesting scheduling policies, such as Generalized Processor Sharing and semi-preemptive priority scheduling. We also show that the well-documented \(c\mu\)-rule is a special case of our framework.
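For readers unfamiliar with the \(c\mu\)-rule mentioned above, it is commonly stated as: give priority to the class with the largest product of holding-cost rate c and service rate μ. A minimal sketch follows; the class names and numbers are hypothetical, and `c_mu_order` is an illustrative helper rather than anything taken from the paper.

```python
def c_mu_order(classes):
    """Return class names sorted by decreasing c * mu (highest priority first).

    classes maps a class name to (c, mu): holding-cost rate and service rate.
    """
    return sorted(classes, key=lambda k: classes[k][0] * classes[k][1], reverse=True)


# Hypothetical traffic classes: name -> (c, mu).
classes = {"voice": (4.0, 1.0), "video": (2.0, 3.0), "data": (1.0, 2.0)}
order = c_mu_order(classes)  # products: voice 4.0, video 6.0, data 2.0
```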

7.
In this work, we present an algorithm for solving constrained optimization problems that does not make explicit use of the objective function derivatives. The algorithm mixes an inexact restoration framework with filter techniques, where the forbidden regions can be given by the flat or slanting filter rule. Each iteration is decomposed into two independent phases: a feasibility phase, which reduces an infeasibility measure without evaluating the objective function, and an optimality phase, which reduces the objective function value. As the derivatives of the objective function are not available, the optimality step is computed by derivative-free trust-region internal iterations. Any technique for constructing the trust-region models can be used, as long as the gradient of the model is a reasonable approximation of the gradient of the objective function at the current point. Under this and classical assumptions, we prove that the full steps are efficient in the sense that, near a feasible nonstationary point, the decrease in the objective function is relatively large, which ensures the global convergence of the algorithm. Numerical experiments show the effectiveness of the proposed method.

8.
The focus of this paper is the optimization of complex multi-parameter systems. We consider systems in which the objective function is not known explicitly, and can only be evaluated through computationally intensive numerical simulation or through costly physical experiments. The objective function may also contain many local extrema which may be of interest. Given objective function values at a scattered set of parameter values, we develop a response surface model that can dramatically reduce the required computation time for parameter optimization runs. The response surface model is developed using radial basis functions, producing a model whose objective function values match those of the original system at all sampled data points. Interpolation to any other point is easily accomplished and generates a model which represents the system over the entire parameter space. This paper presents the details of the use of radial basis functions to transform scattered data points, obtained from a complex continuum mechanics simulation of explosive materials, into a response surface model of a function over the given parameter space. Response surface methodology and radial basis functions are discussed in general and are applied to a global optimization problem for an explosive oil well penetrator.
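The interpolation property described above (the surrogate matches the sampled values exactly) can be sketched as follows. This is a minimal illustration assuming a Gaussian radial basis function with a hypothetical shape parameter `eps`; the sample points, the test function, and the helper names `gaussian`, `solve`, and `fit_rbf` are assumptions, not the paper's model.

```python
import math


def gaussian(r, eps=1.0):
    """Gaussian radial basis function of the distance r."""
    return math.exp(-((eps * r) ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_rbf(points, values, eps=1.0):
    """Return a surrogate s(x) = sum_i w_i * phi(|x - p_i|) that interpolates the data."""
    A = [[gaussian(math.dist(p, q), eps) for q in points] for p in points]
    w = solve(A, values)

    def surface(x):
        return sum(wi * gaussian(math.dist(x, p), eps) for wi, p in zip(w, points))

    return surface


# Scattered samples of a stand-in "expensive" objective (hypothetical).
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
values = [math.sin(x) + y * y for x, y in points]
surface = fit_rbf(points, values)
```

Once fitted, `surface` can be evaluated at any parameter value at negligible cost, which is what makes it usable inside an optimization loop in place of the expensive simulation.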

9.
We study the inverse optimization problem in the following formulation: given a family of parametrized optimization problems and a real number called the demand, determine for which values of the parameters the optimal value of the objective function equals the demand. We formulate general questions and problems about the optimal parameter set and the optimal value function. Then we turn our attention to the case of linear programming, when parameters can be selected from given intervals (“inverse interval LP”). We prove that the problem is NP-hard not only in general, but even in a very special case. We inspect three special cases: parameters appearing in the right-hand sides, parameters appearing in the objective function, and parameters appearing in both the right-hand sides and the objective function. We design a technique based on parametric programming, which allows us to inspect the optimal parameter set. We illustrate the theory by examples.

10.
In existing production/inventory models with random yields, the set-up cost and yield variability are given and fixed. However, in many practical situations, they can be reduced by investment in modern production technology. In this paper, we consider an inventory system with random yield in which both the set-up cost and yield variability can be reduced through capital investment. The objective is to determine the optimal capital investment and ordering policies that minimize the expected total annual costs for the system. In addition, an iterative solution procedure is presented to find the optimal order quantity and reorder point and then the optimal set-up cost and yield standard deviation. Numerical examples are given to illustrate the results obtained and to assess the cost savings from adopting capital investments. Managerial implications are also included.

11.
In many countries, forest policies consist of a system of various regulations, taxes, and subsidies. In this article, we focus on those policies that regulate selective harvesting and study the example of Central Africa. We use a deterministic singular optimal control model of renewable resources to assess these policies with respect to a first-best situation which integrates a social surplus or externality function. In particular, in contrast to earlier articles, we analyze a stock-dependent tax, for which the objective function is piecewise differentiable. We use a theorem proposed by Hartl and Feichtinger to solve the mathematical problem. We show that this tax is the most flexible instrument with respect to fund collection.

12.
The multivariate discrete moment problem (MDMP) was introduced by Prékopa. The objective of the MDMP is to find the minimum and/or maximum of the expected value of a function of a random vector with a discrete finite support, where the probability distribution is unknown but some of the moments are given. The MDMP can be formulated as a linear programming problem; however, the coefficient matrix is very ill-conditioned. Hence, the LP problem usually cannot be solved in a regular way. In the univariate case Prékopa developed a numerically stable dual method for the solution. It is based on the knowledge of all dual feasible bases under some conditions on the objective function. In the multidimensional case the recent results are also about the dual feasible basis structures. Unfortunately, at higher dimensions, the whole structure has not been found under any circumstances. This means that a dual method, similar to Prékopa's, cannot be developed. Only bounds on the objective function value are given, which can be far from the optimum. This paper introduces a different approach to treat the numerical difficulties. The method is based on multivariate polynomial bases. Our algorithm, in most cases, yields the optimum of the MDMP without any assumption on the objective function. The efficiency of the method is tested on several numerical examples.

13.
A new approach is proposed for the maximization of profit by optimal scheduling of machinery. Only one objective function (profit) is used instead of two (availability and cost); the two-objective approach inevitably resulted in suboptimization. In addition, the "single objective function" approach naturally lends itself to comparisons of efficiency between any preventive maintenance policies. Optimal solutions were found in order to compare the efficiency of the commonly used policies of age and block replacement. Numerical results show that age replacement is always more profitable. Optimal solutions for these two maintenance policies were also found in the specific case where a maintenance repair is superior in quality to a breakdown repair. Finally, the physical law of increasing entropy, applied to the failure rate concept, leads to the conclusion that preventive maintenance should always be considered.

14.
This paper is concerned with the dynamic assignment of servers to tasks in queueing networks where demand may exceed the capacity for service. The objective is to maximize the system throughput. We use fluid limit analysis to show that several quantities of interest, namely the maximum possible throughput, the maximum throughput for a given arrival rate, the minimum arrival rate that will yield a desired feasible throughput, and the optimal allocations of servers to classes for a given arrival rate and desired throughput, can be computed by solving linear programming problems. We develop generalized round-robin policies for assigning servers to classes for a given arrival rate and desired throughput, and show that our policies achieve the desired throughput as long as this throughput is feasible for the arrival rate. We conclude with numerical examples that illustrate the points discussed and provide insights into the system behavior when the arrival rate deviates from the one the system is designed for.

15.
We examine a sequential selection problem in which a single option must be selected. Each option's value is a function of its attributes, whose precise values can be ascertained at a given cost. We prove the optimality of a threshold stopping rule for a general class of objective functions.

16.
We present a fully polynomial time approximation scheme (FPTAS) for optimizing a very general class of non-linear functions of low rank over a polytope. Our approximation scheme relies on constructing an approximate Pareto-optimal front of the linear functions which constitute the given low-rank function. In contrast to existing results in the literature, our approximation scheme does not require the assumption of quasi-concavity on the objective function. For the special case of quasi-concave function minimization, we give an alternative FPTAS, which always returns a solution which is an extreme point of the polytope. Our technique can also be used to obtain an FPTAS for combinatorial optimization problems with non-linear objective functions, for example when the objective is a product of a fixed number of linear functions. We also show that it is not possible to approximate the minimum of a general concave function over the unit hypercube to within any factor, unless P = NP. We prove this by showing a similar hardness of approximation result for supermodular function minimization, a result that may be of independent interest.

17.
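Fuzzy optimal interval values for a class of fuzzy linear programming models   Cited by: 2 in total (self-citations: 0, citations by others: 2)
This paper discusses a class of linear programming problems with fully fuzzy coefficients that have both fuzzy inequality constraints and fuzzy equality constraints. At a given fuzzy membership level, the model is converted into an interval-number linear programming model. By determining the best objective function and largest feasible region, as well as the worst objective function and smallest feasible region, of the interval model, the fuzzy optimal interval value of the objective function is obtained, which provides the decision maker with more decision information. Finally, a numerical example is given.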

18.
This paper considers an optimisation problem encountered in the implementation of traffic policies on network routers, namely the ordering of rules in an access control list to minimise or reduce processing time and hence packet latency. The problem is formulated as an objective function with constraints and shown to be NP-complete by translation to a known problem. Exact and heuristic solution methods are introduced, discussed and compared, and computational results are given. The emphasis throughout is on practical implementation of the optimisation process, that is, within the tight constraints of a production network router seeking to reduce latency on-line and in real time, but without the overhead of significant extra computation.

19.
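Some approximation results in stochastic programming   Cited by: 1 in total (self-citations: 0, citations by others: 1)
This paper mainly discusses how the sequence of objective functions of a class of stochastic programs converges under several notions of convergence: convergence in distribution of a sequence of probability measures, epi-convergence of a sequence of functions, and mean-square integrable convergence of a sequence of random variables. Based on these convergence results, some approximation ideas are given, which can be applied to solving this class of stochastic programming problems.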

20.
Continuous-time Markovian decision models with countable state space are investigated. The existence of an optimal stationary policy is established for the expected average return criterion function. It is shown that the expected average return can be expressed as an expected discounted return of a related Markovian decision process. A policy iteration method is given which converges to an optimal deterministic policy; the policy so obtained is shown to be optimal over all Markov policies.
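As a simplified analogue of the policy iteration method mentioned above, the following sketch runs policy iteration on a small discrete-time, finite-state, discounted MDP. It is not the continuous-time, countable-state, average-return setting of the paper; the transition data, discount factor `GAMMA`, and function names are assumptions chosen for illustration.

```python
GAMMA = 0.9  # discount factor (hypothetical)

# P[s][a] -> list of (next_state, probability); R[s][a] -> one-step reward.
P = {
    0: {0: [(0, 0.9), (1, 0.1)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 0.8), (0, 0.2)]},
}
R = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 2.0}}


def q_value(V, s, a):
    """One-step lookahead value of taking action a in state s."""
    return R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])


def evaluate(policy, tol=1e-10):
    """Iterative policy evaluation: sweep until the largest update is below tol."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            new = q_value(V, s, policy[s])
            delta = max(delta, abs(new - V[s]))
            V[s] = new
        if delta < tol:
            return V


def policy_iteration():
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: 0 for s in P}
    while True:
        V = evaluate(policy)
        improved = {}
        for s in P:
            best_a = policy[s]
            for a in P[s]:
                # Switch only on a strict improvement, to avoid cycling on ties.
                if q_value(V, s, a) > q_value(V, s, best_a) + 1e-9:
                    best_a = a
            improved[s] = best_a
        if improved == policy:
            return policy, V
        policy = improved


policy, V = policy_iteration()
```

The returned policy is stationary and deterministic, and its value function satisfies the Bellman optimality condition up to the numerical tolerance used in the sketch.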
