應用潛在語意分析於測驗題庫相似性之比對

郭榮芳; Kuo,Jung-Fang

dc.contributor	何榮桂	zh_TW
dc.contributor.author	郭榮芳	zh_TW
dc.contributor.author	Kuo,Jung-Fang	en_US
dc.date.accessioned	2019-08-29T07:44:53Z
dc.date.available	2005-07-31
dc.date.available	2019-08-29T07:44:53Z
dc.date.issued	2005
dc.description.abstract	本研究旨在應用資訊檢索技術中潛在語意分析（latent semantic analysis，LSA）的方法，分析題庫中的試題是否有相同或相似的情形，並探討使用潛在語意分析時，冗詞去除與否、權重的調整與維度約化（dimension reduction）對結果的影響，研究目的有二：一、探討潛在語意分析是否能有效找出題庫中相同或相似的試題？二、探討使用潛在語意分析時，冗詞去除與否、何種調整權重方式與約化的維度，在分析試題相似度時效果較佳？本研究使用「電腦軟體應用技能檢定丙級學科」92年與93年共1000題選擇題為題庫，並將其試題與試題間的相似度分為完全相同、非常相似、部分相似與些微相似四類，研究結論如下：一、有去除冗詞在分析各種相似程度的試題其效果皆優於無去除冗詞者。二、適合本題庫調整詞彙與試題關係矩陣權重的方式為log-entropy。三、判斷兩試題是否完全相同時，保留的維度愈高精確率愈高，判斷兩試題是否非常相似、部分相似與些微相似時，保留維度依序為30、15與14時，精確率較佳。四、對於本題庫中（1）用詞完全相同、（2）部分辭彙不同、（3）敘述方式不同，但題意相同、（4）辭彙不同，但意義相同四類試題，系統皆能正確的判斷出來。	zh_TW
dc.description.abstract	This study applies latent semantic analysis (LSA), a technique from information retrieval, to determine whether an item bank contains identical or similar items, and examines how common-word removal, weight adjustment, and dimension reduction affect the results. The study has two aims: 1. To examine whether LSA can effectively find identical or similar items in an item bank. 2. To examine whether removing common words, which weighting method, and how many retained dimensions give the best results when analyzing item similarity. The item bank consists of 1,000 multiple-choice items from the "Computer Software Application Skill Examination, Grade C" written tests of ROC years 92 and 93 (2003 and 2004). Similarity between items was classified into four levels: completely identical, extremely similar, partially similar, and slightly similar. The conclusions are as follows: 1. Removing common words gives better results than not removing them at every level of similarity. 2. The weighting method for the term-by-item matrix that suits this item bank is log-entropy. 3. When judging whether two items are completely identical, precision increases as more dimensions are retained; when judging whether two items are extremely similar, partially similar, or slightly similar, precision is better when 30, 15, and 14 dimensions are retained, respectively. 4. For the four types of items in this bank, namely (1) items with identical wording, (2) items that differ in some terms, (3) items phrased differently but with the same meaning, and (4) items that use different terms with the same meaning, the system identifies all of them correctly.	en_US
dc.description.sponsorship	資訊教育研究所	zh_TW
dc.identifier	G0069008034
dc.identifier.uri	http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G0069008034%22.&%22.id.&
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/92653
dc.language	中文
dc.subject	潛在語意分析	zh_TW
dc.subject	題庫	zh_TW
dc.subject	相似試題	zh_TW
dc.subject	latent semantic analysis	en_US
dc.subject	item bank	en_US
dc.subject	similar item	en_US
dc.title	應用潛在語意分析於測驗題庫相似性之比對	zh_TW
dc.title	Applying Latent Semantic Analysis for the Comparison of Item Bank Similarity	en_US
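
The abstract names three processing steps: log-entropy weighting of the term-by-item matrix, dimension reduction, and comparison of items for similarity. The following is a minimal sketch of such a pipeline, not the thesis's implementation. It assumes NumPy, a truncated SVD for the dimension reduction, and cosine similarity for the comparison. The function names and the toy count matrix are hypothetical, and the thesis's word segmentation and common-word list are not reproduced.

```python
import numpy as np


def log_entropy_weight(counts):
    """Log-entropy weighting of a term-by-item count matrix (terms x items).

    Assumes at least two items, so that log(n_items) is non-zero.
    """
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[1]
    row_sums = counts.sum(axis=1, keepdims=True)
    # p[i, j]: share of term i's occurrences that fall in item j
    p = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    # p * log(p), with 0 * log(0) taken as 0
    plogp = p * np.log(np.where(p > 0, p, 1.0))
    # Global weight per term: 1 + sum_j p log p / log(n_items)
    global_w = 1.0 + plogp.sum(axis=1) / np.log(n_items)
    # Local weight log(1 + count), scaled by the term's global weight
    return np.log1p(counts) * global_w[:, None]


def lsa_item_vectors(weighted, k):
    """Project items into a k-dimensional space via truncated SVD."""
    _, s, vt = np.linalg.svd(weighted, full_matrices=False)
    return vt[:k].T * s[:k]  # shape: items x k


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between row vectors."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = np.divide(vectors, norms, out=np.zeros_like(vectors), where=norms > 0)
    return unit @ unit.T


# Hypothetical toy data: 5 terms x 4 items; items 0 and 1 use identical terms.
counts = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
])

weighted = log_entropy_weight(counts)
item_vectors = lsa_item_vectors(weighted, k=2)  # k is an illustrative choice
sim = cosine_similarity_matrix(item_vectors)
print(np.round(sim, 3))
```

In this sketch, `k` plays the role of the retained dimension discussed in the abstract. Pairs whose similarity exceeds a chosen threshold would be flagged as candidate duplicates; the thresholds used in the thesis are not stated in this record.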