- Tytuł:
- Learning board evaluation function for Othello by hybridizing coevolution with temporal difference learning
- Autorzy:
-
Szubert, M.
Jaśkowski, W.
Krawiec, K. - Powiązania:
- https://bibliotekanauki.pl/articles/206175.pdf
- Data publikacji:
- 2011
- Wydawca:
- Polska Akademia Nauk. Instytut Badań Systemowych PAN
- Tematy:
-
evolutionary computation
coevolutionary algorithms
reinforcement learning
memetic computing
game strategy learning - Opis:
- Hybridization of global and local search techniques has already produced promising results in the fields of optimization and machine learning. It is commonly presumed that approaches employing this idea, like memetic algorithms combining evolutionary algorithms and local search, benefit from complementarity of constituent methods and maintain the right balance between exploration and exploitation of the search space. While such extensions of evolutionary algorithms have been intensively studied, hybrids of local search with coevolutionary algorithms have not received much attention. In this paper we attempt to fill this gap by presenting Coevolutionary Temporal Difference Learning (CTDL) that works by interlacing global search provided by competitive coevolution and local search by means of temporal difference learning. We verify CTDL by applying it to the board game of Othello, where it learns board evaluation functions represented by a linear architecture of weighted piece counter. The results of a computational experiment show CTDL superiority compared to coevolutionary algorithm and temporal difference learning alone, both in terms of performance of elaborated strategies and computational cost. To further exploit CTDL potential, we extend it by an archive that keeps track of selected well-performing solutions found so far and uses them to improve search convergence. The overall conclusion is that the fusion of various forms of coevolution with a gradient-based local search can be highly beneficial and deserves further study.
- Źródło:
-
Control and Cybernetics; 2011, 40, 3; 805-831
0324-8569 - Pojawia się w:
- Control and Cybernetics
- Dostawca treści:
- Biblioteka Nauki