The Big Win Strategy: An improvement over AlphaZero approach for Othello (改進AlphaZero的大贏策略並應用於黑白棋)

張乃元; Chang, Nai-Yuan

dc.contributor	林順喜	zh_TW
dc.contributor	Lin, Shun-Shii	en_US
dc.contributor.author	張乃元	zh_TW
dc.contributor.author	Chang, Nai-Yuan	en_US
dc.date.accessioned	2019-09-05T11:15:45Z
dc.date.available	2021-07-25
dc.date.available	2019-09-05T11:15:45Z
dc.date.issued	2019
dc.description.abstract	DeepMind的AlphaZero演算法在電腦遊戲對局領域中取得了巨大的成功，在許多具有挑戰性的遊戲中都取得了超越人類的表現，但是我們認為AlphaZero演算法中仍然有可以改進的地方。 AlphaZero演算法只估計遊戲的輸贏或是平手，而忽略了最後可能會獲得多少分數。而在像是圍棋或是黑白棋這類的佔地型遊戲中，最後所得到的分數往往會相當大地左右遊戲的勝負，於是我們提出大贏策略：在AlphaZero演算法中加入對於分數的判斷，來改進演算法的效率。在本研究中使用8路黑白棋作為實驗大贏策略效果的遊戲，我們使用並且修改網路上一個實作AlphaZero演算法的開源專案：alpha-zero-general來進行我們的實驗。經過我們的實驗之後，使用大贏策略的模型相比未使用的原始AlphaZero模型，在經過100個迭代的訓練之後有著高達78%的勝率，證明大贏策略對於AlphaZero演算法有著十分顯著的改進效益。	zh_TW
dc.description.abstract	DeepMind's AlphaZero algorithm has achieved great success in computer game playing and has surpassed human performance in many challenging games, but we believe it still leaves room for improvement. AlphaZero only estimates whether a game ends in a win, a loss, or a draw, and ignores how many points may be obtained at the end. In territory-based games such as Go or Othello, the final score largely determines the outcome of the game. We therefore propose the Big Win Strategy, which adds an assessment of the score to the AlphaZero algorithm in order to improve its efficiency. In this study, we use 8x8 Othello as the test game for the Big Win Strategy, and we run our experiments on a modified version of alpha-zero-general, an open-source project that implements the AlphaZero algorithm. In our experiments, after 100 training iterations the model using the Big Win Strategy achieves a win rate as high as 78% when compared with the original AlphaZero model that does not use it, which shows that the Big Win Strategy significantly improves the AlphaZero algorithm.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	G060647066S
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060647066S%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106524
dc.language	Chinese
dc.subject	電腦對局	zh_TW
dc.subject	黑白棋	zh_TW
dc.subject	蒙地卡羅法	zh_TW
dc.subject	神經網路	zh_TW
dc.subject	深度學習	zh_TW
dc.subject	computer game	en_US
dc.subject	othello	en_US
dc.subject	Monte Carlo	en_US
dc.subject	neural network	en_US
dc.subject	deep learning	en_US
dc.subject	AlphaZero	en_US
dc.title	改進AlphaZero的大贏策略並應用於黑白棋	zh_TW
dc.title	The Big Win Strategy: An improvement over AlphaZero approach for Othello	en_US

Collections

Theses (學位論文)
