Similar literature
 Found 20 similar documents (search time: 671 ms)
1.
Accelerating autonomous learning by using heuristic selection of actions   Total citations: 2, self-citations: 0, citations by others: 2
This paper investigates how to improve action selection for online policy learning in robotic scenarios using reinforcement learning (RL) algorithms. Since finding control policies using any RL algorithm can be very time consuming, we propose to combine RL algorithms with heuristic functions for selecting promising actions during the learning process. With this aim, we investigate the use of heuristics for increasing the rate of convergence of RL algorithms and contribute a new learning algorithm, Heuristically Accelerated Q-learning (HAQL), which incorporates heuristics for action selection into the Q-learning algorithm. Experimental results on robot navigation show that the use of even very simple heuristic functions results in a significant improvement in the learning rate.
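
The abstract above does not state the action-selection rule itself, so the following is only a minimal sketch of one plausible form: an epsilon-greedy choice in which each action's Q-value is increased by a weighted heuristic bonus. The function name `select_action`, the weight `xi`, and the dictionary-based representation are assumptions made for illustration, not the paper's definitions.

```python
import random


def select_action(q_values, heuristic, xi=1.0, epsilon=0.1, rng=random):
    """Pick an action for the current state.

    q_values:  dict mapping action -> learned Q-value (current state).
    heuristic: dict mapping action -> heuristic bonus (assumed form).
    xi:        weight of the heuristic relative to the Q-values.
    epsilon:   probability of exploring with a uniformly random action.
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        # Exploration step: ignore both Q-values and heuristic.
        return rng.choice(actions)
    # Exploitation step: the heuristic only biases the choice; the
    # Q-values themselves are left untouched.
    return max(actions, key=lambda a: q_values[a] + xi * heuristic.get(a, 0.0))


q = {"left": 0.2, "right": 0.1}
h = {"right": 0.5}  # hypothetical hint: the goal lies to the right
choice_with_hint = select_action(q, h, xi=1.0, epsilon=0.0)
choice_without_hint = select_action(q, h, xi=0.0, epsilon=0.0)
```

With `xi = 0` the rule reduces to ordinary epsilon-greedy selection, which makes it easy to compare learning with and without the heuristic.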

2.
We consider bi-criteria optimization problems for decision rules and rule systems relative to length and coverage. We study decision tables with many-valued decisions, in which each row is associated with a set of decisions, as well as tables with single-valued decisions, in which each row has a single decision. Short rules are more understandable; rules covering more rows are more general. Both of these problems, minimization of length and maximization of coverage of rules, are NP-hard. We create dynamic programming algorithms which can find the minimum length and the maximum coverage of rules, and can construct the set of Pareto optimal points for the corresponding bi-criteria optimization problem. This approach is applicable to medium-sized decision tables. However, the considered approach allows us to evaluate the quality of various heuristics for decision rule construction which are applicable to relatively large datasets. We can evaluate these heuristics from the point of view of (i) single-criterion optimization: we can compare the length or coverage of rules constructed by heuristics; and (ii) bi-criteria optimization: we can measure the distance of a point (length, coverage) corresponding to a heuristic from the set of Pareto optimal points. The presented results show that the best heuristics from the point of view of bi-criteria optimization are not always the best ones from the point of view of single-criterion optimization.
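
To make the bi-criteria evaluation above concrete, here is a small sketch that filters a set of (length, coverage) points down to the Pareto optimal ones (shorter is better, larger coverage is better) and measures how far a heuristic's point lies from that set. It does not reproduce the paper's dynamic programming algorithms; the function names and the use of Euclidean distance are assumptions for illustration.

```python
import math


def pareto_front(points):
    """Return the non-dominated (length, coverage) points.

    A point dominates another if its length is no larger and its
    coverage is no smaller, with at least one strict inequality.
    """
    front = []
    # Sort by length ascending; for equal length, best coverage first.
    for length, coverage in sorted(set(points), key=lambda p: (p[0], -p[1])):
        # A longer rule is worth keeping only if it covers strictly more.
        if not front or coverage > front[-1][1]:
            front.append((length, coverage))
    return front


def distance_to_front(point, front):
    """Euclidean distance from a point to its nearest Pareto point (assumed metric)."""
    return min(math.hypot(point[0] - l, point[1] - c) for l, c in front)


rules = [(1, 2), (2, 5), (2, 3), (3, 4), (3, 9)]  # made-up example data
front = pareto_front(rules)
gap = distance_to_front((2, 3), front)
```

A heuristic whose point has distance 0 lies on the front; in practice the two axes would likely need rescaling before distances are compared.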

3.
In this paper, we present a random iterative graph-based hyper-heuristic to produce a collection of heuristic sequences to construct solutions of different quality. These heuristic sequences can be seen as dynamic hybridisations of different graph colouring heuristics that construct solutions step by step. Based on these sequences, we statistically analyse the way in which graph colouring heuristics are automatically hybridised. This, to our knowledge, represents a new direction in hyper-heuristic research. It is observed that spending the search effort on hybridising Largest Weighted Degree with Saturation Degree at the early stage of solution construction tends to generate high-quality solutions. Based on these observations, an iterative hybrid approach is developed to adaptively hybridise these two graph colouring heuristics at different stages of solution construction. The overall aim here is to automate the heuristic design process, which draws upon an emerging research theme on developing computer methods to design and adapt heuristics automatically. Experimental results on benchmark exam timetabling and graph colouring problems demonstrate the effectiveness and generality of this adaptive hybrid approach compared with previous methods for automatically generating and adapting heuristics. Indeed, we also show that the approach is competitive with state-of-the-art human-produced methods.

4.
This paper presents an investigation of a simple generic hyper-heuristic approach upon a set of widely used constructive heuristics (graph coloring heuristics) in timetabling. Within the hyper-heuristic framework, a tabu search approach is employed to search for permutations of graph heuristics which are used for constructing timetables in exam and course timetabling problems. This underpins a multi-stage hyper-heuristic where the tabu search employs permutations upon a different number of graph heuristics in two stages. We study this graph-based hyper-heuristic approach within the context of exploring fundamental issues concerning the search space of the hyper-heuristic (the heuristic space) and the solution space. Such issues have not been addressed in other hyper-heuristic research. These approaches are tested on both exam and course benchmark timetabling problems and are compared with fine-tuned, bespoke state-of-the-art approaches. The results are within the range of the best results reported in the literature. The approach described here is significantly more generally applicable than the current state of the art in the literature. Future work will extend this hyper-heuristic framework by employing methodologies which are applicable to a wider range of timetabling and scheduling problems.

5.
In this paper we investigate the use of hyper-heuristic methodologies for predicting DNA sequences. In particular, we utilize Sequencing by Hybridization. We believe that this is the first time that hyper-heuristics have been investigated in this domain. A hyper-heuristic is provided with a set of low-level heuristics and the aim is to decide which heuristic to call at each decision point. We investigate three types of hyper-heuristics. Two of these (simulated annealing and tabu search) draw their inspiration from meta-heuristics. The choice function hyper-heuristic draws its inspiration from reinforcement learning. We utilize two independent sets of low-level heuristics. The first set is based on a previous tabu search method, with the second set being a significant extension to this basic set, including utilizing a different representation and introducing the definition of clusters. The datasets we use comprise two randomly generated datasets and a publicly available biological dataset. In total, we carried out experiments using 70 different combinations of heuristics, using the three datasets mentioned above and investigating six different hyper-heuristic algorithms. Our results demonstrate the effectiveness of a hyper-heuristic approach to this problem domain. It is necessary to provide a good set of low-level heuristics, which are able to both intensify and diversify the search, but this approach has demonstrated very encouraging results on this extremely difficult and important problem domain.

6.
A Tabu-Search Hyperheuristic for Timetabling and Rostering   Total citations: 4, self-citations: 0, citations by others: 4
Hyperheuristics can be defined as heuristics which choose between heuristics in order to solve a given optimisation problem. The main motivation behind the development of such approaches is the goal of developing automated scheduling methods which are not restricted to one problem. In this paper we report the investigation of a hyperheuristic approach and evaluate it on various instances of two distinct timetabling and rostering problems. In the framework of our hyperheuristic approach, heuristics compete using rules based on the principles of reinforcement learning. A tabu list of heuristics is also maintained which prevents certain heuristics from being chosen at certain times during the search. We demonstrate that this tabu-search hyperheuristic is an easily re-usable method which can produce solutions of at least acceptable quality across a variety of problems and instances. In effect, the proposed method is capable of producing solutions that are competitive with those obtained using state-of-the-art problem-specific techniques for the problems studied here, but is fundamentally more general than those techniques.
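
The mechanism described above (heuristics competing through reinforcement-style scores, with a tabu list temporarily excluding some of them) can be sketched roughly as follows. The exact scoring and tabu rules are not given in the abstract, so the rank updates, the fixed-length tabu list, and the accept-only-improvements rule here are all assumptions for illustration.

```python
from collections import deque


def tabu_hyperheuristic(solution, cost, heuristics, iterations=100, tabu_size=2):
    """Repeatedly apply the best-ranked non-tabu low-level heuristic.

    heuristics: dict mapping name -> function(solution) -> new solution.
    A heuristic that improves the cost gains rank; one that fails loses
    rank and becomes tabu until it drops off the fixed-length list.
    """
    ranks = {name: 0 for name in heuristics}
    tabu = deque(maxlen=tabu_size)
    best = solution
    for _ in range(iterations):
        candidates = [name for name in heuristics if name not in tabu]
        if not candidates:  # every heuristic is tabu: release them all
            tabu.clear()
            candidates = list(heuristics)
        name = max(candidates, key=lambda n: ranks[n])
        trial = heuristics[name](solution)
        if cost(trial) < cost(solution):
            ranks[name] += 1  # reward the successful heuristic
            solution = trial
        else:
            ranks[name] -= 1  # penalise and temporarily forbid it
            tabu.append(name)
        if cost(solution) < cost(best):
            best = solution
    return best, ranks


# Toy problem (made up): move an integer towards the target value 3.
moves = {
    "dec": lambda x: x - 1,
    "inc": lambda x: x + 1,
    "half": lambda x: x // 2,
}
best, ranks = tabu_hyperheuristic(10, lambda x: abs(x - 3), moves, iterations=30)
```

The point of the design is that the high-level loop never inspects the problem itself: it sees only heuristic names and cost changes, which is what makes the approach reusable across problems.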

7.
The complete topology design problem of survivable mesh-based transport networks is to address simultaneously the design of network topology, working path routing, and spare capacity allocation based on span-restoration. Each constituent problem in the complete design problem can be formulated as an Integer Programming (IP) problem and has been proved to be NP-hard. Due to the large number of decision variables and constraints involved in the IP formulation, solving the problem directly with exact algorithms (e.g. branch-and-bound) would be impractical if not impossible. In this paper, we present a two-level evolutionary approach to address the complete topology design problem. At the low level, two parameterized greedy heuristics are developed to jointly construct feasible solutions (i.e., closed graph topologies satisfying all the mesh-based network survivability constraints) of the complete problem. Unlike existing “zoom-in”-based heuristics, in which subsets of the constraints are considered, the proposed heuristics take all constraints into account. An estimation of distribution algorithm works on top of the heuristics to tune the control parameters. As a result, an optimal solution to the considered problem is more likely to be constructed by the heuristics with the optimal control parameters. The proposed algorithm is evaluated experimentally in comparison with the latest heuristics based on the IP software CPLEX and with the “zoom-in”-based approach on 28 test network problems. The experimental results demonstrate that the proposed algorithm is more effective in finding high-quality topologies than the IP-based heuristic algorithm in 21 out of 28 test instances at much lower computational cost, and performs significantly better than the “zoom-in”-based approach in 19 instances at the same computational cost.

8.
The reliability-redundancy allocation problem is an optimization problem that achieves better system reliability by determining levels of component redundancies and reliabilities simultaneously. The problem is classified among the hardest problems in the reliability optimization field because the decision variables are mixed-integer and the system reliability function is nonlinear, non-separable, and non-convex. Thus, iterative heuristics are highly recommended for solving the problem due to their reasonable solution quality and relatively short computation time. At present, most iterative heuristics use sensitivity factors to select an appropriate variable which significantly improves the system reliability. The sensitivity factor represents the impact of each variable on the system reliability at a designated iteration. However, these heuristics are inefficient in terms of solution quality and computation time because the sensitivity factor calculations are performed only for integer variables. This degrades exploration and increases the number of subsequent continuous nonlinear programming (NLP) subproblems. To overcome the drawbacks of existing iterative heuristics, we propose a new scaling method based on the multi-path iterative heuristics introduced by Ha (2004). The scaling method is able to compute sensitivity factors for all decision variables and results in a decreased number of NLP subproblems. In addition, the approximation heuristic for NLP subproblems helps to avoid redundant computation of NLP subproblems caused by outlined solution candidates. Numerical experimental results show that the proposed heuristic is superior to the best existing heuristic in terms of solution quality and computation time.

9.
A review of the literature on heuristics would suggest two approaches to their use in problem solving: mathematical and engineering. It is suggested, however, that there is a third approach for real-world application, which the authors have called relational. Instead of investigating a problem through the medium of mathematical models and then deriving heuristics because direct optimising techniques are not available, it is advocated that a close relationship between problem owner and problem solver can be achieved by setting down together the decision rules that the owner employs. In this way, a heuristic model is developed directly, with the opportunity to introduce additional procedures as the situation allows. The result is a consistent control mechanism which is invaluable for both strategic and operational decision making. The model can be a predictor of the effects of policy decisions as well as a means by which those decisions can be implemented and monitored, dependent as much upon the balancing of sometimes conflicting objectives by management as upon the setting of bounds to achieve a guaranteed performance.

10.
A Metaheuristic to Solve a Location-Routing Problem with Non-Linear Costs   Total citations: 1, self-citations: 0, citations by others: 1
The paper deals with a location-routing problem with non-linear cost functions. To the best of our knowledge, a mixed integer linear programming formulation for the addressed problem is proposed here for the first time. Since the problem is NP-hard, exact algorithms are able to solve only particular cases; heuristics are therefore needed to solve more general versions. The algorithm proposed in this paper is a combination of a p-median approach to find an initial feasible solution and a metaheuristic to improve the solution. It is a hybrid metaheuristic merging Variable Neighborhood Search (VNS) and Tabu Search (TS) principles and exploiting the synergies between the two. Computational results and conclusions close the paper.

11.
This research studies the problem of batching orders in a dynamic, finite-horizon environment to minimize order tardiness and overtime costs of the pickers. The problem introduces the following trade-off: at every period, the picker has to decide whether to go on a tour and pick the accumulated orders, or to wait for more orders to arrive. By waiting, the picker risks higher tardiness of existing orders in exchange for lower tardiness of future orders. We use a Markov decision process (MDP) based approach to set an optimal decision-making policy. In order to evaluate the potential improvement of the proposed approach in practice, we compare the optimal policy with two naïve heuristics: (1) “Go on tour immediately after an order arrives”, and (2) “Wait as long as the current orders can be picked and supplied on time”. The optimal policy shows a considerable improvement over the naïve heuristics, in the range of 7–99%, where the specific values depend on the picking process parameters. We have found that one measure, the slack percentage of the picking process, associated with the difference between the promised lead time and the single item picking time, predicts quite accurately the cost reduction generated by the optimal policy. Since only relatively small-scale problems can be solved by the optimal algorithm, a heuristic was developed, based on the structure and properties of the optimal solutions. Numerical results show that the proposed heuristic, MDP-H, outperforms the naïve heuristics in all experiments. Compared with the optimal solution, MDP-H provides close to optimal results for a slack of up to 40%.

12.
In this paper we analyse the performance of flowshop sequencing heuristics with respect to the objectives of makespan and flowtime minimisation. For flowtime minimisation, we propose the strategy employed by the NEH heuristic to construct partial solutions. Results show that this approach outperforms the common fast heuristics for flowtime minimisation while performing similarly to or slightly worse than others which, in turn, prove to be much more CPU time-consuming. Additionally, the suggested approach is well balanced with respect to makespan and flowtime minimisation. Based on the previous results, two algorithms are proposed for the sequencing problem with multiple objectives: makespan and flowtime minimisation. These algorithms provide the decision maker with a set of heuristically efficient solutions such that he/she may choose the most suitable sequence for a given ratio between costs associated with makespan and those assigned to flowtime. Computational experience shows both algorithms to perform better than the current heuristics designed for the two-criteria problem.
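
As an illustration of the NEH construction strategy mentioned above, the sketch below builds a permutation flowshop sequence by inserting jobs one at a time at the position that minimises a chosen objective of the partial sequence. It shows the commonly described NEH insertion scheme with a pluggable objective (makespan or total flowtime); the job ordering by decreasing total processing time and all function names are assumptions, and the paper's own variants are not reproduced.

```python
def completion_times(sequence, proc):
    """Completion time of each job on the last machine.

    proc[job][machine] is the processing time; every job visits the
    machines in the same order (permutation flowshop).
    """
    machines = len(proc[0])
    prev = [0] * machines  # completion times of the previous job
    finished = []
    for job in sequence:
        cur = [0] * machines
        for m in range(machines):
            ready = cur[m - 1] if m else 0  # job leaves the previous machine
            cur[m] = max(prev[m], ready) + proc[job][m]
        prev = cur
        finished.append(cur[-1])
    return finished


def makespan(sequence, proc):
    return completion_times(sequence, proc)[-1]


def flowtime(sequence, proc):
    return sum(completion_times(sequence, proc))


def neh(proc, objective=makespan):
    """Insert jobs one by one at the best position of the partial sequence."""
    order = sorted(range(len(proc)), key=lambda j: -sum(proc[j]))
    seq = []
    for job in order:
        candidates = [seq[:i] + [job] + seq[i:] for i in range(len(seq) + 1)]
        seq = min(candidates, key=lambda s: objective(s, proc))
    return seq


times = [[3, 2], [1, 4], [2, 2]]  # made-up instance: 3 jobs, 2 machines
seq_makespan = neh(times, makespan)
seq_flowtime = neh(times, flowtime)
```

Passing `flowtime` instead of `makespan` changes only the criterion used to pick the insertion position, which is the sense in which one construction scheme can serve both objectives.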

13.
We introduce a novel variant of the travelling salesmen problem and propose a hyper-heuristic methodology in order to solve it. In a competitive travelling salesmen problem (CTSP), m travelling salesmen are to visit n cities and the relationship between the travelling salesmen is non-cooperative. The salesmen will receive a payoff if they are the first one to visit a city and they pay a cost for any distance travelled. The objective of each salesman is to visit as many unvisited cities as possible, with a minimum travelling distance. Due to the competitive element, each salesman needs to consider the tours of the other salesmen when planning their own tour. Since equilibrium analysis is difficult in the CTSP, a hyper-heuristic methodology is developed. The model assumes that each agent adopts a heuristic (or set of heuristics) to choose their moves (or tour) and each agent knows that the moves/tours of all agents are not necessarily optimal. The hyper-heuristic consists of a number of low-level heuristics, each of which can be used to create a move/tour given the heuristics of the other agents, together with a high-level heuristic that is used to select from the low-level heuristics at each decision point. Several computational examples are given to illustrate the effectiveness of the proposed approach.

14.
Bin-oriented heuristics for the one-dimensional bin-packing problem construct solutions by packing one bin at a time. Several such heuristics consider two or more subsets for each bin and pack the one with the largest total weight. These heuristics sometimes generate poor solutions, due to a tendency to use many small items early in the process. To address this problem, we propose a method of controlling the average weight of items packed by bin-oriented heuristics. Constructive heuristics and an improvement heuristic based on this approach are introduced. Additionally, reduction methods for bin-oriented heuristics are presented. The results of an extensive computational study show that: (1) controlling average weight significantly improves solutions and reduces computation time of bin-oriented heuristics; (2) reduction methods improve solutions and processing times of some bin-oriented heuristics; and (3) the new improvement heuristic outperforms all other known complex heuristics, in terms of both average solution quality and computation time.
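
For readers unfamiliar with the term, the following sketch shows what "packing one bin at a time" can look like in its simplest form: each bin is filled greedily with the largest remaining items that still fit before the next bin is opened. This is a generic baseline chosen for illustration, not the average-weight-controlled method proposed in the abstract, and the function name is hypothetical.

```python
def pack_bin_by_bin(weights, capacity):
    """Greedy bin-oriented packing: close one bin before opening the next."""
    remaining = sorted(weights, reverse=True)
    bins = []
    while remaining:
        load, current, leftover = 0, [], []
        for w in remaining:
            if load + w <= capacity:
                current.append(w)  # item fits into the bin being packed
                load += w
            else:
                leftover.append(w)  # defer the item to a later bin
        if not current:
            raise ValueError("an item is larger than the bin capacity")
        bins.append(current)
        remaining = leftover
    return bins


packed = pack_bin_by_bin([7, 5, 4, 3, 1], capacity=10)
```

Small items act as filler in this scheme, which hints at the weakness the abstract describes: using them up early can leave later bins with only large, poorly fitting items.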

15.
This paper proposes two constructive heuristics for the well-known single-level uncapacitated dynamic lot-sizing problem. The proposed heuristics, called net least period cost (nLPC) and nLPC(i), are developed by modifying the average period cost concept from Silver and Meal's heuristic, commonly known as least period cost (LPC). An improved tie-breaking stopping rule and a locally optimal decision rule are proposed in the second heuristic to enhance performance. We test the effectiveness of the proposed heuristics by using 20 benchmarking test problems frequently used in the literature. Furthermore, we perform a large-scale simulation study involving three factors, 50 experimental conditions, and 100,000 randomly generated problems to evaluate the proposed heuristics against LPC and six other well-known constructive heuristics in the literature. The simulation results show that both nLPC and nLPC(i) produce average holding and setup costs lower than or equal to those of LPC in every one of the 50 experimental conditions. The proposed heuristics also outperform each of the six other heuristics evaluated in all experimental conditions, without an increase in computational requirements. Lastly, considering that both nLPC and nLPC(i) are fairly simple for practitioners to understand and that lot-sizing heuristics have been commonly used in practice, there should be a very good chance for practical applications of the proposed heuristics.
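
The average period cost idea that the abstract builds on can be sketched as below: starting from an order period, the lot is extended to cover further periods for as long as the average cost per covered period (setup plus holding) does not increase. This follows the Silver-Meal (LPC) rule as it is commonly described, under the assumption that holding cost is charged per unit per period carried; it is not the nLPC or nLPC(i) heuristic from the paper, whose details the abstract does not give.

```python
def silver_meal(demand, setup_cost, holding_cost):
    """Return the order quantity for each period (0 means no order).

    holding_cost is charged per unit for every period a unit is carried
    (assumed cost model).
    """
    n = len(demand)
    lots = [0] * n
    t = 0
    while t < n:
        periods = 1            # the lot currently covers period t only
        holding = 0.0
        best_avg = setup_cost  # average cost per period for a 1-period lot
        while t + periods < n:
            # Units for period t + periods are carried for `periods` periods.
            extra = holding_cost * periods * demand[t + periods]
            avg = (setup_cost + holding + extra) / (periods + 1)
            if avg > best_avg:
                break          # average cost per period starts to rise
            holding += extra
            best_avg = avg
            periods += 1
        lots[t] = sum(demand[t:t + periods])
        t += periods
    return lots


plan = silver_meal([10, 10, 100, 10], setup_cost=50, holding_cost=1)
```

In the made-up example, carrying the 100 units of period 3 from period 1 would cost more than a fresh setup, so the rule places a second order there.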

16.
This study considers decisions in workforce management assuming individual workers are inherently different as measured by general cognitive ability (GCA). A mixed integer programming (MIP) model that determines different staffing decisions (i.e., hire, cross-train, and fire) in order to minimize workforce-related costs over multiple periods is described. Solving the MIP for large problem instances is computationally burdensome. In this paper, two linear programming (LP) based heuristics and a solution space partition approach are presented to reduce the computational time. A genetic algorithm was also implemented as an alternative method to obtain better solutions and for comparison with the proposed heuristics. The heuristics were applied to realistic manufacturing systems with a large number of machine groups. Experimental results show that the performance of the LP-based heuristics is surprisingly good and indicate that the heuristics can solve large problem instances effectively with reasonable computational effort.

17.
In this paper we propose and evaluate an evolutionary-based hyper-heuristic approach, called EH-DVRP, for solving hard instances of the dynamic vehicle routing problem. A hyper-heuristic is a high-level algorithm, which generates or chooses a set of low-level heuristics in a common framework, to solve the problem at hand. In our collaborative framework, we have included three different types of low-level heuristics: constructive, perturbative, and noise heuristics. Basically, the hyper-heuristic manages and evolves a sophisticated sequence of combinations of these low-level heuristics, which are sequentially applied in order to construct and improve partial solutions, i.e., partial routes. In presenting some design considerations, we have taken into account the need for proper cooperation and communication among low-level heuristics in order to find the most promising sequence to tackle partial states of the (dynamic) problem. Our approach has been evaluated using Kilby’s benchmarks, which comprise a large number of instances with different topologies and degrees of dynamism, and we have compared it with some well-known methods proposed in the literature. The experimental results have shown that, due to the dynamic nature of the hyper-heuristic, our proposed approach is able to adapt to dynamic scenarios more naturally than low-level heuristics. Furthermore, the hyper-heuristic can obtain high-quality solutions when compared with other (meta) heuristic-based methods. Therefore, the findings of this contribution justify the employment of hyper-heuristic techniques in such changing environments, and we believe that further contributions could be successfully proposed in related dynamic problems.

18.
This paper studies heuristics for the minimum labelling spanning tree (MLST) problem. The purpose is to find a spanning tree using edges that are as similar as possible. Given an undirected labelled connected graph, the minimum labelling spanning tree problem seeks a spanning tree whose edges have the smallest number of distinct labels. This problem has been shown to be NP-hard. A Greedy Randomized Adaptive Search Procedure (GRASP) and a Variable Neighbourhood Search (VNS) are proposed in this paper. They are compared with other algorithms recommended in the literature: the Modified Genetic Algorithm and the Pilot Method. Nonparametric statistical tests show that the heuristics based on GRASP and VNS outperform the other algorithms tested. Furthermore, a comparison with the results provided by an exact approach shows that we may quickly obtain optimal or near-optimal solutions with the proposed heuristics.

19.
A vital task facing government agencies and commercial organizations that report data is to represent the data in a meaningful way and simultaneously to protect the confidentiality of critical components of this data. The challenge is to organize and disseminate data in a form that prevents such critical components from being inferred by groups bent on corporate espionage, seeking competitive advantages, or wishing to penetrate the security of the information underlying the data. Controlled tabular adjustment is a recently developed approach for protecting sensitive information by imposing a special form of statistical disclosure limitation on tabular data. The underlying model gives rise to a mixed integer linear programming problem involving both continuous and discrete (zero-one) variables. We develop stratified ordered (s-ordered) heuristics and a new meta-heuristic learning approach for solving this model, and compare their performance to previous heuristics and to an exact algorithm embodied in the state-of-the-art ILOG CPLEX software. Our new approaches are based on partitioning the problem into its discrete and continuous components, first creating an s-ordered heuristic that reduces the number of binary variables through a grouping procedure that combines an exact mathematical programming model with constructive heuristics. To gain further advantages we then replace the mathematical programming model with an evolutionary scatter search approach that makes it possible to extend the method to large problems with over 9000 entries. Finally, we introduce a new metaheuristic learning method that significantly improves the quality of solutions obtained.

20.
Modern technology is succeeding in delivering more information to people at ever faster rates. Under traditional views of rational decision making, where individuals should evaluate and combine all available evidence, more information will yield better decisions. But our minds are designed to work in environments where information is often costly and difficult to obtain, leading us to use simple, fast and frugal heuristics when making many decisions. These heuristics typically ignore most of the available information and rely on only a few important cues. Yet they make choices that are accurate in their appropriate application domains, achieving ecological rationality through their fit to particular information structures. This paper presents four classes of simple heuristics that use limited information (recognition-based heuristics, one-reason decision mechanisms, multiple-cue elimination strategies, and quick sequential search mechanisms), applied to environments from stock market investment to judging intentions of other organisms to choosing a mate. The findings that ecological rationality can be achieved with limited information are also used to indicate how our mind’s design, relying on decision mechanisms tuned to specific environments, should be taken into account in our technology’s design, creating environments that can enable better decisions.
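
As a concrete example of a one-reason decision mechanism, the sketch below compares two options cue by cue, in a fixed order of assumed cue importance, and decides on the first cue that distinguishes them, ignoring all remaining information. This follows the "take-the-best" scheme as it is commonly described in the fast-and-frugal heuristics literature; the cue names and data are invented for illustration and do not come from the paper.

```python
def take_the_best(option_a, option_b, cues):
    """Decide between two options using the first discriminating cue.

    option_a, option_b: dicts mapping cue name -> 1 (present) or 0 (absent).
    cues: cue names ordered from most to least important (assumed given).
    Returns "a", "b", or None when no cue discriminates (caller may guess).
    """
    for cue in cues:
        value_a = option_a.get(cue, 0)
        value_b = option_b.get(cue, 0)
        if value_a != value_b:
            # Stop searching: one reason is enough to decide.
            return "a" if value_a > value_b else "b"
    return None


# Hypothetical task: which of two cities is larger?
city_a = {"is_capital": 0, "has_airport": 1, "has_university": 1}
city_b = {"is_capital": 0, "has_airport": 1, "has_university": 0}
cue_order = ["is_capital", "has_airport", "has_university"]
decision = take_the_best(city_a, city_b, cue_order)
```

Search stops at the first cue that differs, so the number of cues actually inspected is often much smaller than the number available, which is the frugality the abstract refers to.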
