New Heuristics for Monte Carlo Tree Search Applied to the Game of Go (應用於電腦圍棋之蒙地卡羅樹搜尋法的新啟發式演算法)

dc.contributor: 林順喜 [zh_TW]
dc.contributor: Rémi Coulom [zh_TW]
dc.contributor: Shun-Shii Lin [en_US]
dc.contributor: Rémi Coulom [en_US]
dc.contributor.author: 黃士傑 [zh_TW]
dc.contributor.author: Shih-Chieh Huang [en_US]
dc.date.accessioned: 2019-09-05T11:47:29Z
dc.date.available: 2011-7-27
dc.date.available: 2019-09-05T11:47:29Z
dc.date.issued: 2011
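dc.description.abstract: Research on computer Go began in 1970, but Go programs were never regarded as strong until 2006, when Monte Carlo Tree Search (MCTS) and Upper Confidence bounds applied to Trees (UCT) appeared and changed the situation completely. The revolution brought about by MCTS and UCT was so powerful that people even began to believe that Go programs would be able to defeat top human players in 10 or 20 years. In this research, we propose several new heuristics for MCTS, with two main contributions. The first is the successful application of Simulation Balancing to 9×9 Go. Simulation Balancing is an algorithm for training the parameters of the simulations. When Silver and Tesauro proposed the method in 2009, they experimented only on smaller boards; our experimental results are the first to demonstrate the effectiveness of Simulation Balancing in 9×9 Go, specifically by showing that it outperforms the well-known supervised learning algorithm Minorization-Maximization (MM) by about 90 Elo. The second contribution is a systematic set of experiments on various time control methods for 19×19 Go. The results clearly indicate that clever time control schemes can greatly improve playing strength. All experiments were run on our Go program ERICA, which, benefiting from these heuristics and experimental results, won the gold medal in 19×19 Go at the 2010 Computer Olympiad. [translated from zh_TW]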
dc.description.abstract: Research into computer Go started around 1970, but Go-playing programs were never considered truly strong until 2006, when the new search scheme Monte Carlo Tree Search (MCTS) and Upper Confidence bounds applied to Trees (UCT) appeared. MCTS and UCT advanced computer Go to such a degree that people began to believe that in ten or twenty years Go-playing programs would be able to defeat the top human players. In this research, we propose new heuristics for MCTS, with two main contributions. The first is the successful application of Simulation Balancing (SB), an algorithm for training the parameters of the simulation, to 9×9 Go. SB was proposed by Silver and Tesauro in 2009, but it had been tested only on small board sizes. Our experiments are the first to demonstrate its effectiveness in 9×9 Go, showing that SB surpasses the well-known supervised learning algorithm Minorization-Maximization (MM) by about 90 Elo. The second contribution is a systematic set of experiments on various time management schemes for 19×19 Go. The results indicate that clever time management algorithms can considerably improve playing strength. All experiments were performed on our Go-playing program ERICA, which benefited from these heuristics and experimental results to win the gold medal in the 19×19 Go tournament at the 2010 Computer Olympiad. [en_US]
dc.description.sponsorship: Department of Computer Science and Information Engineering [translated from zh_TW]
dc.identifier: GN0893080079
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0893080079%22.&%22.id.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106941
dc.language: English
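dc.subject: 人工智慧 (Artificial Intelligence) [zh_TW]
dc.subject: 圍棋 (Go) [zh_TW]
dc.subject: 電腦圍棋 (computer Go) [zh_TW]
dc.subject: 蒙地卡羅樹搜尋 (Monte Carlo Tree Search) [zh_TW]
dc.subject: 樹狀結構信賴上界法 (Upper Confidence bounds applied to Trees) [zh_TW]
dc.subject: 模擬平衡化 (Simulation Balancing) [zh_TW]
dc.subject: 時間控制 (Time Management) [zh_TW]
dc.subject: Erica [zh_TW]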
dc.subject: Artificial Intelligence [en_US]
dc.subject: Go [en_US]
dc.subject: computer Go [en_US]
dc.subject: Monte Carlo Tree Search (MCTS) [en_US]
dc.subject: Upper Confidence bounds applied to Trees (UCT) [en_US]
dc.subject: Simulation Balancing [en_US]
dc.subject: Time Management [en_US]
dc.subject: Erica [en_US]
dc.title: 應用於電腦圍棋之蒙地卡羅樹搜尋法的新啟發式演算法 [zh_TW]
dc.title: New Heuristics for Monte Carlo Tree Search Applied to the Game of Go [en_US]

Files

Original bundle

Name: n089308007901.pdf
Size: 3.59 MB
Format: Adobe Portable Document Format
