Transfer Learning for Operator Selection: A Reinforcement Learning Approach

Rafet Durgut, Mehmet Emin Aydin, Abdur Rakib

Research output: Working paper/Preprint

Abstract

In the past two decades, metaheuristic optimization algorithms (MOAs) have become increasingly popular, particularly for problems in logistics, science, and engineering. A fundamental characteristic of such algorithms is that they depend on a parameter or a strategy. Both online and offline strategies are employed to obtain optimal configurations of the algorithms. Adaptive operator selection is one of them; it determines whether or not to update a strategy from the strategy pool during the search process. In the field of machine learning, Reinforcement Learning (RL) refers to goal-oriented algorithms that learn from the environment how to achieve a goal. In MOAs, reinforcement learning has been utilised to control the operator selection process. Existing research, however, fails to show that learned information can be transferred from one problem-solving procedure to another. The primary goal of the proposed research is to determine the impact of transfer learning on RL and MOAs. As a test problem, the set union knapsack problem with 30 separate benchmark problem instances is used. The results are statistically compared in depth. According to the findings, the learning process improved the convergence speed while significantly reducing the CPU time.
Original language: English
Publisher: Preprints.org
Publication status: Published - 21 Dec 2021
Externally published: Yes

Bibliographical note

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Keywords

  • Transfer learning
  • Reinforcement learning
  • Adaptive operator selection
  • Artificial bee colony
