首页 | 本学科首页   官方微博 | 高级检索  
     检索      


Multi-armed bandit-based hyper-heuristics for combinatorial optimization problems
Institution:1. Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Santiago de Chile 7941169, Chile;2. Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Viña del Mar, Chile;3. Innovation and Sustainability Data Lab (ISDaLab), UPF Barcelona School of Management, C. Balmes 132-134, Barcelona 08008, Spain;1. Rutgers Business School, 1 Washington Park, Newark, NJ 07102, USA;2. Department of Systems and Information Engineering, University of Virginia, VA 22903, USA;1. Newcastle Business School, Northumbria University, Newcastle upon Tyne, UK;2. School of Engineering, Newcastle University, Newcastle upon Tyne NE1 7RU, UK;3. Management School, University of Liverpool, Liverpool L69 3BX, UK;4. Business School, Newcastle University, Newcastle upon Tyne NE1 4SE, UK;5. Department of Operations and Information Systems, University of Graz, Graz, Austria;1. Neoma Business School, Mont Saint-Aignan, France;2. Department of Industrial Engineering and Management, American University of Beirut, Beirut, Lebanon;3. Department of Economics & Business, Colorado School of Mines, Golden, CO, 80401, USA;1. EM Normandie Business School, Métis Lab, Clichy 92110, France;2. Tilburg School of Economics and Managemnt, Tilburg University, LE Tilburg 5000, Netherlands;1. ANT/OR - Operations Research Group, Department of Engineering Management, University of Antwerp, Belgium;2. IMT Nord Europe, Institut Mines-Télécom, Univ. Lille, Centre for Digital Systems, Lille F-59000, France;1. STOR-i Centre for Doctoral Training, Lancaster University, Lancaster LA1 4YR, United Kingdom;2. Department of Management Science, Lancaster University, Lancaster LA1 4YX, United Kingdom;3. Department of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YR, United Kingdom;4. Operations Research Department, Naval Postgraduate School, Monterey, CA 93943, United Kingdom
Abstract:There are significant research opportunities in the integration of Machine Learning (ML) methods and Combinatorial Optimization Problems (COPs). In this work, we focus on metaheuristics to solve COPs that have an important learning component. These algorithms must explore a solution space and learn from the information they obtain in order to find high-quality solutions. Among the metaheuristics, we study Hyper-Heuristics (HHs), algorithms that, given a number of low-level heuristics, iteratively select and apply heuristics to a solution. The HH we consider has a Markov model to produce sequences of low-level heuristics, which we combine with a Multi-Armed Bandit Problem (MAB)-based method to learn its parameters. This work proposes several improvements to the HH metaheuristic that yields a better learning for solving problem instances. Specifically, this is the first work in HHs to present Exponential Weights for Exploration and Exploitation (EXP3) as a learning method, an algorithm that is able to deal with adversarial settings. We also present a case study for the Vehicle Routing Problem with Time Windows (VRPTW), for which we include a list of low-level heuristics that have been proposed in the literature. We show that our algorithms can handle a large and diverse list of heuristics, illustrating that they can be easily configured to solve COPs of different nature. The computational results indicate that our algorithms are competitive methods for the VRPTW (2.16% gap on average with respect to the best known solutions), demonstrating the potential of these algorithms to solve COPs. Finally, we show how algorithms can even detect low-level heuristics that do not contribute to finding better solutions to the problem.
Keywords:
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号