Improving the AlphaZero Algorithm in the Playing and Training Phases

dc.contributor: 林順喜 [zh_TW]
dc.contributor: Lin, Shun-Shii [en_US]
dc.contributor.author: 陳志宏 [zh_TW]
dc.contributor.author: Chen, Chih-Hung [en_US]
dc.date.accessioned: 2022-06-08T02:43:39Z
dc.date.available: 2024-10-01
dc.date.available: 2022-06-08T02:43:39Z
dc.date.issued: 2021
dc.description.abstract: none [zh_TW]
dc.description.abstract: AlphaZero has achieved great success across many challenging games, but training a strong model requires enormous computational power. Rather than investing that level of resources, we focus on improving the performance of AlphaZero itself. In this work, we introduce seven major enhancements to AlphaZero. First, the AlphaZero-miniMax Hybrids strategy combines the modern AlphaZero approach with traditional search algorithms to improve the strength of the program. Second, the Proven-mark strategy prunes unneeded moves to avoid the re-sampling problem and to increase the chance of exploring promising moves. Third, the Quick-win strategy differentiates rewards according to the length of the game-tree search, so that wins (or losses) are no longer all treated equally. Fourth, the Best-win strategy resolves the problem of inaccurate win rates by updating with the best reward rather than the average. Fifth, the Threat-space-reduction strategy improves neural network training under limited resources. Sixth, the Big-win strategy takes into account the number of points in the final outcome instead of simply labeling it win/loss/draw. Finally, the Multistage-training strategy improves the quality of the neural network for multistage games. After years of work, we have obtained promising results showing that these strategies improve the performance of the AlphaZero algorithm on some test domains. [en_US]
dc.description.sponsorship: Department of Computer Science and Information Engineering (資訊工程學系) [zh_TW]
dc.identifier: 80047002S-40253
dc.identifier.uri: https://etds.lib.ntnu.edu.tw/thesis/detail/965feca0896757a0aadfca6515239c49/
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/117352
dc.language: English
dc.subject: none [zh_TW]
dc.subject: AlphaZero-miniMax Hybrids [en_US]
dc.subject: Proven-mark strategy [en_US]
dc.subject: Quick-win strategy [en_US]
dc.subject: Best-win strategy [en_US]
dc.subject: Threat-space-reduction [en_US]
dc.subject: Big-win strategy [en_US]
dc.subject: Multistage-training strategy [en_US]
dc.title: 在下棋與訓練階段改進AlphaZero演算法 [zh_TW]
dc.title: Improving the AlphaZero Algorithm in the Playing and Training Phases [en_US]
dc.type: Academic thesis
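
Note on the Quick-win and Best-win strategies named in the abstract: the abstract gives no formulas, so the following is only a minimal sketch of the two ideas. Everything in it is an assumption made for illustration: the function names, the exponential `decay` factor, and the use of a plain `max` as the backup rule are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def quick_win_reward(outcome: int, depth: int, decay: float = 0.99) -> float:
    """Scale a terminal outcome (+1 win, -1 loss, 0 draw) by search depth.

    Hypothetical rule: a win found at a shallower depth keeps more of its
    reward, and a loss at a shallower depth is penalised more, so wins and
    losses are no longer all treated equally.
    """
    return outcome * decay ** depth


def average_backup(rewards: Sequence[float]) -> float:
    """Conventional backup: the mean of the rewards seen below a node."""
    return sum(rewards) / len(rewards)


def best_backup(rewards: Sequence[float]) -> float:
    """Assumed Best-win style backup: keep the best reward, not the mean."""
    return max(rewards)


if __name__ == "__main__":
    print(quick_win_reward(1, 2), quick_win_reward(1, 10))
    child_rewards = [0.2, 0.9, -0.5]
    print(average_backup(child_rewards), best_backup(child_rewards))
```

Under these assumptions, a single strong line of play is not diluted by weaker siblings when the best reward is kept, which is one way to read the abstract's "inaccurate win rate problem".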
