Integrating the Annealing Algorithm and Orthogonal Experimental Design to Improve Classification by the K-Means Algorithm
Date
2008
Abstract
Data clustering plays a crucial role in data mining research and is widely applied across many fields. A good clustering algorithm partitions a dataset into clusters quickly and correctly, which brings out the characteristics of each cluster and supports subsequent data processing and analysis. The k-means algorithm is one of the most commonly used clustering algorithms. It is fast and simple to apply, but it also has a number of shortcomings, so its result is not necessarily the best possible clustering. To improve the clustering results of k-means, this work proposes a clustering approach based on the simulated annealing (SA) algorithm, called HSAKM.
The method first runs k-means once to find the cluster centers. Then, within each cluster, it takes the m data points closest to the original cluster center as new centers. Finally, it applies SA perturbation to the clustering in order to improve the performance of k-means. HSAKM is compared with k-means and with other clustering algorithms to verify that the proposed method achieves better performance.
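The three steps named in the abstract (an initial k-means run, selection of the m points nearest each center, and an SA perturbation stage) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the cost function, the perturbation rule, or the cooling schedule, so the sketch assumes a sum-of-squared-errors cost, a move that swaps one cluster's center for one of its m candidate points, and geometric cooling. All function names and the parameters `t0`, `alpha`, and `steps` are hypothetical, and the orthogonal experimental design step mentioned in the title is not described in the abstract and is omitted here.

```python
import math
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def sse(points, centers):
    """Assumed cost: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans(points, k, rng, iters=50):
    """Plain Lloyd iteration, started from k randomly chosen data points."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers

def nearest_candidates(points, centers, m):
    """For each cluster, the m member points closest to its center."""
    labels = assign(points, centers)
    cands = []
    for j, c in enumerate(centers):
        members = [p for p, l in zip(points, labels) if l == j]
        members.sort(key=lambda p: sq_dist(p, c))
        cands.append(members[:m] or [c])
    return cands

def anneal(points, centers, cands, rng, t0=1.0, alpha=0.9, steps=200):
    """Assumed SA stage: swap one center for a candidate, Metropolis acceptance."""
    current, cur_cost = list(centers), sse(points, centers)
    best, best_cost = list(current), cur_cost
    t = t0
    for _ in range(steps):
        j = rng.randrange(len(current))
        trial = list(current)
        trial[j] = rng.choice(cands[j])
        delta = sse(points, trial) - cur_cost
        if delta <= 0 or rng.random() < math.exp(-delta / max(t, 1e-12)):
            current, cur_cost = trial, cur_cost + delta
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        t *= alpha  # geometric cooling (assumption)
    return best, best_cost

# Synthetic example data: three Gaussian blobs in the plane.
rng = random.Random(0)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
centers = kmeans(points, 3, rng)
cands = nearest_candidates(points, centers, m=5)
best, best_cost = anneal(points, centers, cands, rng)
```

Because the search starts from the k-means centers and keeps the best configuration seen, the returned cost in this sketch is never worse than that of the initial k-means result. Whether it is actually better depends on the data and on the assumed settings.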
Keywords
k-means algorithm, orthogonal experimental design, simulated annealing algorithm