Outlier Detection in Mixture Hazard Models with a Weibull Baseline
Date
2018
Abstract
Outlier detection is an important topic in statistical analysis: it identifies events or observations that differ markedly from the rest of the data. Finding and handling such observations in a timely way can improve the results of a statistical analysis and make the fitted model easier to interpret. Common applications include structural defects and medical problems.
In medical applications, the Cox proportional hazards model is widely used in survival analysis, mainly to study the relationship between survival time and covariates. Many approaches have therefore been proposed for outlier detection in hazard models, but few of them address the mixture hazard model. Mixture hazard models are nevertheless receiving growing attention, because in medical practice diseases are often divided into several types (groups) based on their causes. It is therefore important to develop a method that both detects outliers and estimates the model for mixture hazard models, which is the subject of this thesis.
This thesis studies outlier detection for the mixture hazard model, which is widely used in medical research, with the Weibull distribution as the baseline. We introduce shrinkage parameters to add a penalty term to the existing likelihood function, and we develop an EM algorithm to estimate the shrinkage parameters and thereby detect outliers in the data. After detecting possible outliers, we refit the model by either weighting or deleting them in order to improve the parameter estimates.
Simulation results show that the proposed method detects outliers in the mixture hazard model effectively, and that deleting the outliers generally yields better parameter estimates, in the sense of smaller bias.
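As a rough illustration of the model class named in the abstract, the sketch below writes out a Weibull baseline hazard, a proportional hazards component built on it, and the E-step of an EM algorithm for a finite mixture of such components with right censoring. This is not the thesis's method: the abstract does not specify the mixture structure or the form of the penalty, so the latent-class mixture assumed here, the function names, and all parameter values are hypothetical, and the shrinkage-parameter penalty is deliberately left out.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull baseline hazard h0(t) = (shape/scale) * (t/scale)**(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_survival(t, shape, scale):
    """Weibull baseline survival S0(t) = exp(-(t/scale)**shape)."""
    return math.exp(-((t / scale) ** shape))


def component_likelihood(t, event, x, shape, scale, beta):
    """Likelihood contribution of one subject under one component.

    Assumes a proportional hazards form h(t | x) = h0(t) * exp(x . beta).
    event = 1 (observed failure): density h(t | x) * S(t | x).
    event = 0 (right-censored):   survival S(t | x).
    """
    risk = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    surv = weibull_survival(t, shape, scale) ** risk
    if event:
        return weibull_hazard(t, shape, scale) * risk * surv
    return surv


def e_step(data, weights, components):
    """Posterior probability that each subject belongs to each component.

    data:       list of (time, event, covariates) tuples
    weights:    mixing proportions, one per component
    components: list of (shape, scale, beta) tuples (hypothetical values)
    """
    responsibilities = []
    for t, event, x in data:
        joint = [
            w * component_likelihood(t, event, x, *comp)
            for w, comp in zip(weights, components)
        ]
        total = sum(joint)
        responsibilities.append([j / total for j in joint])
    return responsibilities


# Two hypothetical components: short-lived (scale 2) and long-lived (scale 10).
components = [(1.5, 2.0, [0.5]), (1.5, 10.0, [0.5])]
weights = [0.5, 0.5]
data = [(1.0, 1, [0.0]), (12.0, 1, [0.0])]
resp = e_step(data, weights, components)
```

Under these assumed values, the early failure at t = 1 is assigned mostly to the short-lived component and the late failure at t = 12 almost entirely to the long-lived one. A penalized version would add a penalty on per-observation shrinkage parameters to the log-likelihood before the M-step; how the thesis does this is not stated in the abstract.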
Keywords
Outlier detection, Mixture hazard model, Weibull distribution, Penalty function, EM algorithm