Related Articles
1.
An integral-type implementable algorithm for multi-objective optimization
A conceptual algorithm based on integral global optimality for solving multi-objective optimization problems was given in [1]. Using uniformly distributed good-point sets from number theory, this paper derives, in a fairly simple way, an implementable algorithm for the integral global optimality approach to multi-objective optimization, together with a termination criterion for the algorithm. Numerical computations on relevant functions show that the algorithm is effective and can be used to find efficient solutions of multi-objective optimization problems.
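The abstract does not reproduce the construction, but its flavor can be sketched. Below is a minimal, hypothetical illustration (not the authors' algorithm): a good-point set built from fractional parts of multiples of square roots of primes, one classical number-theoretic construction, drives a mean-value level-set iteration of the kind used in integral global optimization; the scalarized objective and all names are assumptions.

```python
import math

def good_point_set(n, dim):
    """n low-discrepancy points in [0,1]^dim via the good-point
    construction {k * g_j} with g_j the square root of the j-th prime."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
    gammas = [math.sqrt(p) for p in primes[:dim]]
    return [[(k * g) % 1.0 for g in gammas] for k in range(1, n + 1)]

def level_set_mean(f, points, c):
    """Sample mean of f over the level set {x : f(x) <= c}; the integral
    global-optimality iteration drives c down to the global minimum value."""
    below = [f(x) for x in points if f(x) <= c]
    return sum(below) / len(below) if below else c

# toy scalarized objective standing in for a multi-objective scalarization
f = lambda x: (x[0] - 0.3) ** 2 + (x[1] - 0.7) ** 2
pts = good_point_set(2000, 2)
c = max(f(p) for p in pts)
while True:
    c_new = level_set_mean(f, pts, c)
    if abs(c - c_new) < 1e-12:   # termination criterion: the mean value stabilizes
        break
    c = c_new
print("estimated global minimum value:", c)
```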

2.
An automatic fairing algorithm for planar point sequences
This paper considers the fairing of planar point sequences and recasts it as the construction of a minimal-energy curve: within the strip region determined by the original points and their admissible tolerances, a minimal-energy curve is constructed, and an automatic algorithm is given. The fairing process has two steps. First, convex analysis is used to remove extraneous inflection points within the admissible perturbation range of the original points. Second, under a convexity-preserving constraint, a minimal-energy curve interpolating the point sequence is constructed, and the original data points are faired by modifying this curve. The result is not only a faired point sequence but also a faired curve interpolating it. The method can handle point sequences that are unevenly distributed or even contain large turning angles, and compared with existing methods it offers stronger fairing capability over a wider range of inputs.
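As a loose, hypothetical illustration of only the detection half of the first step (not the paper's algorithm), the sketch below flags where the discrete turning direction of a polyline changes sign, which is where candidate inflection points sit.

```python
def cross_z(p, q, r):
    """z-component of (q-p) x (r-q); its sign is the turning direction at q."""
    return (q[0] - p[0]) * (r[1] - q[1]) - (q[1] - p[1]) * (r[0] - q[0])

def inflection_indices(pts):
    """Vertex indices i such that the polyline's turning direction flips
    between vertex i and vertex i+1 (candidate inflection locations)."""
    signs = [cross_z(pts[i - 1], pts[i], pts[i + 1]) for i in range(1, len(pts) - 1)]
    return [i + 1 for i in range(len(signs) - 1) if signs[i] * signs[i + 1] < 0]

pts = [(0, 0), (1, 0.4), (2, 0.5), (3, 0.3), (4, 0.6), (5, 1.0)]
print(inflection_indices(pts))  # -> [2]: convexity flips between vertices 2 and 3
```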

3.
Trust-region algorithms are an effective class of methods for unconstrained optimization. For the subproblem of such algorithms, this paper extends the usual quadratic model of the objective function to a fourth-order tensor model and proposes an algorithm for the resulting quartic tensor-model subproblem with a trust-region constraint. The main feature of the method is that a descent direction and a corresponding step length can be obtained not only at non-stationary points of the tensor model but also at stationary points that are not local minimizers, so that the sequence of iterates generated by the algorithm contains a subsequence converging to a local minimizer of the trust-region subproblem.
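The quartic tensor-model solver is not given in the abstract; for orientation, here is a minimal sketch of the classical quadratic-model trust-region step via the Cauchy point, the baseline that the tensor model extends (my own illustration, not the paper's method).

```python
import numpy as np

def cauchy_step(g, B, delta):
    """Cauchy point for min g^T s + 0.5 s^T B s subject to ||s|| <= delta:
    minimize the quadratic model along -g within the trust region."""
    gnorm = np.linalg.norm(g)
    if gnorm == 0.0:
        return np.zeros_like(g)
    gBg = g @ B @ g
    if gBg > 0.0:
        # unconstrained minimizer along -g, capped at the boundary
        tau = min(gnorm ** 3 / (delta * gBg), 1.0)
    else:
        tau = 1.0  # non-positive curvature: step to the trust-region boundary
    return -(tau * delta / gnorm) * g

g = np.array([1.0, -2.0])
B = np.array([[2.0, 0.0], [0.0, 1.0]])
print(cauchy_step(g, B, delta=0.5))
```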

4.
This paper establishes a fairly realistic optimal control model for continuous steel casting; the application background is controlling the cooling process to guarantee steel quality. Through the model, the problem is transformed into finding the heat-exchange coefficient that minimizes a certain objective functional. The state equation is solved by a phase relaxation method. By introducing an adjoint state equation, the gradient of the objective functional can be computed, and an optimization algorithm is then designed within the Armijo framework. Numerical experiments show that the optimization results are satisfactory. In the last section the original algorithm is improved, raising the optimization efficiency substantially.
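The adjoint computation is specific to the casting model, but the Armijo framework named above is standard; the following is a generic backtracking line-search sketch in textbook form, with a toy objective standing in for the casting functional.

```python
def armijo_step(f, grad_f, x, d, alpha0=1.0, beta=0.5, sigma=1e-4):
    """Backtracking line search: shrink alpha until the Armijo sufficient-decrease
    condition f(x + a*d) <= f(x) + sigma * a * <grad f(x), d> holds."""
    fx = f(x)
    slope = sum(gi * di for gi, di in zip(grad_f(x), d))  # directional derivative
    alpha = alpha0
    while f([xi + alpha * di for xi, di in zip(x, d)]) > fx + sigma * alpha * slope:
        alpha *= beta
        if alpha < 1e-12:   # safeguard against a non-descent direction
            break
    return alpha

f = lambda x: (x[0] - 1.0) ** 2 + 2.0 * x[1] ** 2
grad = lambda x: [2.0 * (x[0] - 1.0), 4.0 * x[1]]
x = [3.0, 1.0]
d = [-g for g in grad(x)]   # steepest-descent direction
print(armijo_step(f, grad, x, d))
```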

5.
This paper studies the portfolio decision problem of institutional investors from the perspective of bounded rationality. A two-level behavioral portfolio model for institutional investors under a multiple-mental-account scenario is built on a multi-agent basis, and the key parameters of the model are described by a two-state Markov chain and a management entropy function. A simulated example verifies that the model can approximate real decision scenarios.
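As a purely hypothetical illustration of one ingredient, a two-state Markov chain of the kind used to describe the model's key parameters can be simulated as follows; the transition probabilities are invented, not the paper's calibration.

```python
import random

def simulate_two_state(p_stay0, p_stay1, steps, state=0, seed=42):
    """Simulate a two-state Markov chain with transition matrix
    [[p_stay0, 1 - p_stay0], [1 - p_stay1, p_stay1]]."""
    rng = random.Random(seed)
    path = [state]
    for _ in range(steps):
        stay = p_stay0 if state == 0 else p_stay1
        if rng.random() >= stay:
            state = 1 - state
        path.append(state)
    return path

# e.g. a regime sequence driving a behavioral-portfolio parameter
print(simulate_two_state(0.9, 0.8, 20))
```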

6.
This paper studies, from a unified viewpoint, the structure and convergence properties of the method of centers for multi-objective decision making. An algorithmic model of the method of centers in a general form, admitting three curve-search rules, is proposed, and its global convergence is proved under very weak conditions. On this basis, choices of the search direction and other parameters in the model are discussed, and two classes of implementable algorithms are given. The results unify and generalize existing center methods for single- and multi-objective decision making. Numerical results show that the algorithms are effective.

7.
Convergence analysis of a general trust-region algorithm based on conic models
This paper gives a general model for conic-model trust-region algorithms. It includes not only the usual trust-region algorithm, which corresponds to the case b_k = 0 of the conic model, but also the algorithm of [1] as a subclass. We study the strong global convergence of this model and discuss conditions that guarantee a superlinear rate of convergence, thereby generalizing several results of [1] and [4].

8.
Under a single objective and a single constraint, an optimization model for a three-state series-parallel system is established. System reliability is optimized by selecting the most important components, and a corresponding optimization algorithm is given. Finally, an example verifies the effectiveness of the algorithm.
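For orientation only, the sketch below evaluates the reliability of a series-parallel system in the simplified binary-state case (the paper works with three-state components); the component reliabilities are invented.

```python
def parallel_reliability(component_rs):
    """Reliability of a parallel block: it fails only if every component fails."""
    prod_fail = 1.0
    for r in component_rs:
        prod_fail *= (1.0 - r)
    return 1.0 - prod_fail

def series_parallel_reliability(blocks):
    """Series of parallel blocks: the system works iff every block works."""
    r = 1.0
    for block in blocks:
        r *= parallel_reliability(block)
    return r

# three subsystems in series, each a parallel group of components
blocks = [[0.9, 0.9], [0.85, 0.8, 0.75], [0.95]]
print(series_parallel_reliability(blocks))
```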

9.
王岳宝  苏淳 《应用数学》1998,11(4):80-84
Within a broader class of weight functions and boundary functions, and using moment conditions no stronger than those required in the independent case, this paper discusses the limiting behavior and convergence rate of a class of small-parameter series of NA (negatively associated) sequences. The results are therefore of interest even for independent sequences.

10.
A Wu differential characteristic set algorithm for symmetry vectors of (systems of) differential equations and its applications
This paper develops the theory of the Wu differential characteristic set (elimination) algorithm for symmetry vectors of (systems of) partial differential equations (PDEs), treating the computation of both classical and non-classical PDE symmetry vectors uniformly within the Wu differential characteristic set framework. Mechanized principles are given for generating the infinitesimal equations of PDE symmetry vectors and for verifying that a given vector is a symmetry vector of a PDE system; this thoroughly overcomes the defects of the traditional algorithm in theory and provides a new algorithm for computing PDE symmetry vectors. A corresponding software package implementing the algorithm has been written in the computer algebra system Mathematica. As an application, a complete answer is given for the non-classical symmetry vectors of the Burgers equation.

11.
《Optimization》2012,61(1):191-202
This paper presents a recurrent condition on Markov decision processes with a countable state space and bounded rewards. The condition is sufficient for the existence of a Blackwell optimal stationary policy, having the Laurent series expansion with continuous coefficients. It is so relaxed that the Markov chain corresponding to a stationary policy may have countably many periodic recurrent classes. Our method finds the deviation matrix in an explicit form.
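For a finite ergodic chain (a simplification of the countable-state setting above), the deviation matrix has the well-known closed form D = (I - P + Π)^{-1} - Π, where Π is the limiting matrix whose rows all equal the stationary distribution; a sketch under that finite-state assumption:

```python
import numpy as np

def deviation_matrix(P):
    """Deviation matrix D = (I - P + Pi)^{-1} - Pi of a finite ergodic chain,
    where Pi has the stationary distribution in every row."""
    n = P.shape[0]
    # stationary distribution: pi (I - P) = 0 with sum(pi) = 1
    # (replace one redundant equation by the normalization row)
    A = np.eye(n) - P.T
    A[-1, :] = 1.0
    b = np.zeros(n); b[-1] = 1.0
    pi = np.linalg.solve(A, b)
    Pi = np.tile(pi, (n, 1))
    return np.linalg.inv(np.eye(n) - P + Pi) - Pi

P = np.array([[0.5, 0.5],
              [0.2, 0.8]])
print(deviation_matrix(P))   # each row of D sums to zero
```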

12.
In this paper, we consider nonstationary Markov decision processes (MDPs, for short) with an average variance criterion on a countable state space, finite action spaces and bounded one-step rewards. From the optimality equations provided in this paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy, optimal under the original average expected reward criterion, that minimizes the average variance within the class of policies that are optimal for the original average expected reward criterion.

13.
In this paper, partially observable Markov decision processes (POMDPs) with discrete state and action spaces under the average reward criterion are considered from a recently developed sensitivity point of view. By analyzing the average-reward performance difference formula, we propose a policy iteration algorithm with step sizes to obtain an optimal or locally optimal memoryless policy. This algorithm improves the policy along the same direction as standard policy iteration, and suitable step sizes guarantee the convergence of the algorithm. Moreover, the algorithm can be used in Markov decision processes (MDPs) with correlated actions. Two numerical examples illustrate the applicability of the algorithm.
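The step-size policy iteration for POMDPs is the paper's contribution; as a fully observed baseline only, here is standard policy iteration for a finite discounted MDP (a simplification, since the paper treats the average-reward criterion and memoryless POMDP policies).

```python
import numpy as np

def policy_iteration(P, R, gamma=0.95):
    """Policy iteration for a finite MDP.
    P[a][s, s'] : transition probabilities; R[s, a] : one-step rewards."""
    n_states, n_actions = R.shape
    policy = np.zeros(n_states, dtype=int)
    while True:
        # policy evaluation: solve (I - gamma * P_pi) v = r_pi
        P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
        r_pi = R[np.arange(n_states), policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # policy improvement: greedy one-step lookahead
        Q = np.stack([R[:, a] + gamma * P[a] @ v for a in range(n_actions)], axis=1)
        new_policy = Q.argmax(axis=1)   # ties broken by lowest action index
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy

# 2-state, 2-action toy example
P = [np.array([[0.9, 0.1], [0.4, 0.6]]),   # action 0
     np.array([[0.2, 0.8], [0.1, 0.9]])]   # action 1
R = np.array([[1.0, 0.0], [0.0, 2.0]])
print(policy_iteration(P, R))
```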

14.
Directed hypergraphs are a general modelling and algorithmic tool that has been used successfully in many research areas, such as artificial intelligence, database systems, fuzzy systems, propositional logic and transportation networks. However, modelling Markov decision processes using directed hypergraphs has not previously been considered. In this paper we consider finite-horizon Markov decision processes (MDPs) with finite state and action spaces and present an algorithm for finding the K best deterministic Markov policies; that is, we rank the first K deterministic Markov policies in non-decreasing order under an additive criterion of optimality. The algorithm models the finite-horizon MDP as a directed hypergraph. It is shown that the problem of finding the optimal policy can be formulated as a minimum-weight hyperpath problem and solved in time linear in the size of the input data representing the MDP, under different additive optimality criteria.
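The hyperpath ranking machinery is beyond an abstract-level sketch, but the K = 1 case, finding a single optimal deterministic Markov policy for a finite-horizon MDP, reduces to textbook backward induction, illustrated below with invented data.

```python
import numpy as np

def backward_induction(P, R, horizon):
    """Finite-horizon MDP solved by backward induction (dynamic programming).
    Returns the value function and a time-dependent deterministic Markov policy.
    P[a][s, s'] : transitions; R[s, a] : one-step rewards (additive, maximized)."""
    n_states, n_actions = R.shape
    v = np.zeros(n_states)                       # terminal values
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in reversed(range(horizon)):
        Q = np.stack([R[:, a] + P[a] @ v for a in range(n_actions)], axis=1)
        policy[t] = Q.argmax(axis=1)
        v = Q.max(axis=1)
    return v, policy

P = [np.array([[0.7, 0.3], [0.5, 0.5]]),
     np.array([[0.1, 0.9], [0.9, 0.1]])]
R = np.array([[1.0, 0.5], [0.0, 2.0]])
v, policy = backward_induction(P, R, horizon=4)
print(v)        # optimal expected total reward from each initial state
print(policy)   # chosen action per (stage, state)
```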

15.
Continuous-time Markovian decision models with countable state space are investigated. The existence of an optimal stationary policy is established for the expected average return criterion. It is shown that the expected average return can be expressed as an expected discounted return of a related Markovian decision process. A policy iteration method is given which converges to an optimal deterministic policy; the policy so obtained is shown to be optimal over all Markov policies.

16.
In this paper we are concerned with a new algorithm for multichain finite-state Markov decision processes which finds an average-optimal policy through decomposition of the state space into communicating classes and a transient class. For each communicating class a relatively optimal policy is found, and these are used to find an optimal policy by applying the value iteration algorithm. Using a pattern matrix that determines the behaviour pattern of the decision process, the decomposition of the state space is carried out effectively, so that the proposed algorithm simplifies the structured algorithm given in the excellent paper of Leizarowitz (Math Oper Res 28:553–586, 2003). A numerical example is given to illustrate the algorithm.
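The decomposition step can be illustrated independently of the paper's pattern matrix: given, for each pair of states, whether some action moves one to the other with positive probability, the communicating classes are the mutually reachable groups. A small hypothetical sketch:

```python
def communicating_classes(adj):
    """adj[s][t] = 1 if some action moves s to t with positive probability.
    States s and t communicate iff each is reachable from the other."""
    n = len(adj)
    reach = [[bool(adj[s][t]) or s == t for t in range(n)] for s in range(n)]
    # Floyd-Warshall style transitive closure of the reachability relation
    for k in range(n):
        for s in range(n):
            for t in range(n):
                reach[s][t] = reach[s][t] or (reach[s][k] and reach[k][t])
    classes, assigned = [], [False] * n
    for s in range(n):
        if not assigned[s]:
            cls = [t for t in range(n) if reach[s][t] and reach[t][s]]
            for t in cls:
                assigned[t] = True
            classes.append(cls)
    return classes

# union of transition supports over all actions for a 4-state example;
# {2, 3} leaks into {0, 1}, so it would be transient in the decomposition
adj = [[1, 1, 0, 0],
       [1, 1, 0, 0],
       [0, 1, 1, 1],
       [0, 0, 1, 1]]
print(communicating_classes(adj))   # -> [[0, 1], [2, 3]]
```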

17.
We are concerned with Markov decision processes with countable state space and discrete time parameter. The main structural restriction on the model is the following: under the action of any stationary policy, the state space is a communicating class. In this context, we prove the equivalence of ten stability/ergodicity conditions on the transition law of the model, which imply the existence of average-optimal stationary policies for an arbitrary continuous and bounded reward function; these conditions include the Lyapunov function condition (LFC) introduced by A. Hordijk. As a consequence of our results, the LFC is proved to be equivalent to the following: under the action of any stationary policy, the corresponding Markov chain has a unique invariant distribution which depends continuously on the stationary policy being used. A weak form of the latter condition was used by one of the authors to establish the existence of optimal stationary policies via an approach based on renewal theory. This research was supported in part by the Third World Academy of Sciences (TWAS) under Grant TWAS RG MP 898-152.
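The invariant-distribution condition in the last equivalence can be made concrete in the finite-state case (the countable case needs the paper's extra conditions): the unique invariant distribution of the chain induced by a fixed stationary policy solves a linear system, as sketched below.

```python
import numpy as np

def invariant_distribution(P):
    """Least-squares solve of pi P = pi, sum(pi) = 1 for the (assumed unique)
    invariant distribution of the chain induced by a fixed stationary policy."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1); b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

# birth-death example: the invariant distribution is [0.25, 0.5, 0.25]
P = np.array([[0.5,  0.5,  0.0],
              [0.25, 0.5,  0.25],
              [0.0,  0.5,  0.5]])
print(invariant_distribution(P))
```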

18.
This paper is the first to study continuous-time discounted Markov decision programming with a countable state space and countable action sets under reward functions and transition rate families that are not uniformly bounded. A new class of unbounded reward functions is introduced and, within a new class of Markov policies, the existence and structure of optimal policies are discussed. Besides proving the main results known to hold for bounded rewards and uniformly bounded transition rate families, the paper obtains several further important conclusions.

19.
《Optimization》2012,61(2-3):271-283
This paper presents a new class of Markov decision processes: continuous-time shock Markov decision processes, which model Markovian controlled systems that are sequentially shocked by their environment. Between two adjacent shocks, the system can be modeled by a continuous-time Markov decision process; at each shock, however, the system's parameters change and an instantaneous state transition occurs. After presenting the model, we prove that the optimality equation, which consists of countably many equations, has a unique solution in a certain function space Ω.

20.
This paper is the first part of a study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. We prove here the existence of deterministic stationary policies which are Blackwell optimal in the class of all, in general randomized, stationary policies. We also establish a lexicographical policy improvement algorithm leading to Blackwell optimal policies, and the relation between such policies and the Blackwell optimality equation. Our technique is a combination of the weighted norms approach developed in Dekker and Hordijk (1988) for countable models with unbounded rewards and the weak-strong topology approach used in Yushkevich (1997a) for Borel models with bounded rewards.
