資料流最近常見項目集變動探勘之研究

dc.contributor柯佳伶zh_TW
dc.contributorJia-Ling Kohen_US
dc.contributor.author李蕙君zh_TW
dc.contributor.authorHuei-Jyun Lien_US
dc.date.accessioned2019-09-05T11:28:52Z
dc.date.available2009-7-10
dc.date.available2019-09-05T11:28:52Z
dc.date.issued2009
dc.description.abstract本論文針對資料流滑動視窗的模型提出一個探勘狀態變動項目集的方法,稱為CV-SCD(Cross-Verify Status Change Detection)演算法。本方法主要利用兩棵稱為Base-Tree及Delta-Tree的相同字首樹之樹狀結構,儲存在任一時間點t時滑動視窗中所有交易資料,以及從t到t+1之間新增及過時的交易資料,並利用Base-Tree及Delta-Tree的資訊判斷出狀態變動項目集,再同時對兩棵樹遞迴建立包含特定項目的條件樹,以探勘出更長的狀態變動項目集。本論文對固定區間長度探勘出的狀態變動項目集儲存成狀態變動資料項集快照,並採用金字塔式時間框架的結構來儲存快照,提供可由使用者指定特定時間區間對其中各狀態變動項目集的變動情形進行相對特性分析。實驗結果顯示,當新增及過時的交易資料相對於滑動視窗資料為少量,或是資料集中包含之項目種類較多,或是在支持度小的情況下,CV-SCD演算法相較於以FP-growth探勘出常見項目集後再進行狀態變動項目集比對可顯著增進執行效率。zh_TW
dc.description.abstractIn this thesis, a method for discovering recently status-changed itemsets over data streams is proposed, which is named the CV-SCD (Cross-Verify Status Change Detection) algorithm. In this algorithm, two prefix tree structures, which are named Base-Tree and Delta-Tree, respectively, are constructed for maintaining the transaction data in a sliding window at time t, and the newly inserted and removed transactions at time t+1. The data stored in the Base-Tree and Delta-Tree is used to discover the status-changed itemsets at t+1 with respect to t. By performing cross-verification on Delta-Tree and Base-Tree, then by constructing conditional Delta-Tree and conditional Base-Tree recursively, all the status-changed itemsets are discovered in depth-first search. The discovered status-changed itemsets within a fixed time interval are stored in a snapshot of status-changed itemsets. Accordingly, given a specific period of time interval, the changing characteristics of itemsets in this interval can be compared relatively. The experiment results show that FP-growth to discover all the frequent itemsets at each time point and then performing itemsets matching to get status-changed itemsets, the CV-SCD algorithm provides significant improvement on execution time when the added and expired transactions are few, or there are many different items in transaction data, or the support is small.en_US
dc.description.sponsorship資訊工程學系zh_TW
dc.identifierGN0696470021
dc.identifier.urihttp://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22GN0696470021%22.&%22.id.&
dc.identifier.urihttp://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106702
dc.language中文
dc.subject資料流zh_TW
dc.subject滑動視窗zh_TW
dc.subject資料探勘zh_TW
dc.subjectdata streamen_US
dc.subjectsliding windowen_US
dc.subjectdata miningen_US
dc.title資料流最近常見項目集變動探勘之研究zh_TW
dc.titleMining Recently Frequent Itemsets Change over Data Streamsen_US

Files

Original bundle

Now showing 1 - 1 of 1
No Thumbnail Available
Name:
n069647002101.pdf
Size:
893.81 KB
Format:
Adobe Portable Document Format

Collections