柯佳伶Koh, Jia-ling朱聖池Chu, Sheng-Chih2019-09-052016-08-172019-09-052015http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060247031S%22.&%22.id.&http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106361圖型為具結構性的資料結構,能夠完整描述真實世界的資料關聯性。而單一圖型中的常見鄰近樣式在支持度計算上不僅能維持向下包含特性之外,同時支持度也較能提供明確的意義。我們認為先前研究提出探勘常見鄰近樣式之FNP演算法,因採用類似Apriori 演算法的圖型組合方式,將列舉過多不存在的候選樣式。因此本論文採用先前gSpan圖型成長的方法,進行常見鄰近子圖樣式探勘。本論文所提出的gSFNP方法,利用內嵌圖型結構來加速圖型成長,並運用最小深先搜尋碼來檢查圖型同構問題,同時提出在MapReduce平行框架處理的gSFNP_MR演算法以解決系統記憶體不足的問題。由實驗結果顯示,此兩種方法在探勘鄰近子圖樣式的執行時間上都能比FNP更快速地探勘出常見鄰近樣式。Graph is a powerful abstraction of structural data, which is applied to model the various relations among data in a real world. Recently, a new kind of patterns called frequent neighborhood patterns is defined for a large labeled graph. Frequent neighborhood patterns have the downward closure property of the support measure and provide meaningful interpretations of pattern mining. The previous work used an Apriori-like approach to combine the discovered frequent neighborhood patterns into larger candidate patterns, many of the generated candidates may not appear in the graph. In this thesis, we propose an algorithm, which is called the gSFNP algorithm, to find frequent neighborhood patterns efficiently. By applying a pattern growth approach, a data structure of pattern is designed to store the information of the matched sub-graphs for speeding up the following pattern growth computation. Besides, the minimum DFS code of a pattern is used to avoid finding graph isomorphic patterns. Moreover, we propose another MapReduce version of the gSFNP algorithm, which is called the gSFNP_MR algorithm, to solve the problem of insufficient memory in a centralized environment. Finally, we evaluate the performance of gSFNP and gSFNP_MR. The experimental results show that both the proposed algorithms have shorter response time than the previous work.樣式探勘常見鄰近樣式單一圖型探勘pattern miningfrequent neighborhood patternsingle graph mining從單一圖型中有效率探勘常見鄰近樣式之研究Efficiently Finding Neighborhood Patterns in a Large Labeled Graph