Enhancing the Connect Four Program Through the Combination of Multiple Policy Value MCTS and Population Based Training

dc.contributor: 林順喜 (zh_TW)
dc.contributor: Lin, Shun-Shii (en_US)
dc.contributor.author: 蔡宜憲 (zh_TW)
dc.contributor.author: Tsai, Yi-Sian (en_US)
dc.date.accessioned: 2024-12-17T03:37:26Z
dc.date.available: 2025-02-01
dc.date.issued: 2024
dc.description.abstract: Computer games are one of the oldest and best-known applications of artificial intelligence in computer science and engineering, and AlphaZero is a very powerful reinforcement learning algorithm for board games. AlphaZero combines Monte Carlo Tree Search (MCTS) with deep neural networks. Larger neural networks evaluate positions more accurately, while smaller networks are cheaper and faster, so the two must be balanced under a limited budget. Multiple Policy Value Monte Carlo Tree Search combines several neural networks of different sizes and retains the advantages of each. In this study, we modified the AlphaZero General program written by Surag Nair on GitHub, added Multiple Policy Value Monte Carlo Tree Search, and applied it to the game of Connect Four. To accelerate training, we used multiprocessing in the program. Lastly, we used Population Based Training to search for better hyperparameters. (en_US)
dc.description.sponsorship: Department of Computer Science and Information Engineering (zh_TW)
dc.identifier: 61047107S-44597
dc.identifier.uri: https://etds.lib.ntnu.edu.tw/thesis/detail/72bd29f612af9bd5295fe758211bc638/
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/123718
dc.language: Chinese
dc.subject: Computer Games (en_US)
dc.subject: Connect Four (en_US)
dc.subject: Deep Learning (en_US)
dc.subject: AlphaZero (en_US)
dc.subject: Multiple Policy Value Monte Carlo Tree Search (en_US)
dc.title: Enhancing the Connect Four Program Through the Combination of Multiple Policy Value MCTS and Population Based Training (en_US)
dc.type: Academic thesis
