Enhancing the Connect Four Program Through the Combination of Multiple Policy Value MCTS and Population Based Training

蔡宜憲; Tsai, Yi-Sian

dc.contributor	林順喜	zh_TW
dc.contributor	Lin, Shun-Shii	en_US
dc.contributor.author	蔡宜憲	zh_TW
dc.contributor.author	Tsai, Yi-Sian	en_US
dc.date.accessioned	2024-12-17T03:37:26Z
dc.date.available	2025-02-01
dc.date.issued	2024
dc.description.abstract	電腦對局是人工智慧在計算機科學和工程方面的最古老和最著名的應用之一，而AlphaZero在棋類對局領域是一個非常強大的強化學習算法。AlphaZero是用了MCTS與深度神經網路結合的演算法。較大的神經網路在準確評估方面具有優勢，較小的神經網路在成本和效能方面具有優勢，在有限的預算下必須兩者取得平衡。Multiple Policy Value Monte Carlo Tree Search此方法結合了多個不同大小的神經網路，並保留每個神經網路的優勢。本研究以Surag Nair先生在GitHub上的AlphaZero General程式做修改，加入Multiple Policy Value Monte Carlo Tree Search，並實現在連四棋遊戲上。另外在程式中使用了Multiprocessing來加快訓練速度。最後使用了Population Based Training的方式來尋找較佳的超參數。	zh_TW
dc.description.abstract	Computer game playing is one of the oldest and best-known applications of artificial intelligence in computer science and engineering, and AlphaZero is a very powerful reinforcement learning algorithm for board games. AlphaZero combines Monte Carlo Tree Search (MCTS) with deep neural networks. Larger neural networks evaluate positions more accurately, while smaller networks are cheaper and faster, so a limited budget requires a balance between the two. Multiple Policy Value Monte Carlo Tree Search combines several neural networks of different sizes and retains the advantages of each. In this study, we modified the AlphaZero General program written by Surag Nair on GitHub, added Multiple Policy Value Monte Carlo Tree Search, and applied it to Connect Four. We also used multiprocessing in the program to speed up training. Finally, we used Population Based Training to search for better hyperparameters.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	61047107S-44597
dc.identifier.uri	https://etds.lib.ntnu.edu.tw/thesis/detail/72bd29f612af9bd5295fe758211bc638/
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/123718
dc.language	Chinese
dc.subject	電腦對局	zh_TW
dc.subject	連四棋	zh_TW
dc.subject	深度學習	zh_TW
dc.subject	AlphaZero	zh_TW
dc.subject	Multiple Policy Value Monte Carlo Tree Search	zh_TW
dc.subject	Computer Games	en_US
dc.subject	Connect Four	en_US
dc.subject	Deep Learning	en_US
dc.subject	AlphaZero	en_US
dc.subject	Multiple Policy Value Monte Carlo Tree Search	en_US
dc.title	Multiple Policy Value MCTS 結合 Population Based Training 加強連四棋程式	zh_TW
dc.title	Enhancing the Connect Four Program Through the Combination of Multiple Policy Value MCTS and Population Based Training	en_US
dc.type	Academic thesis

Collections

Theses and Dissertations