Development of an Automated Scoring Technique for Divergent Thinking Tests Based on word2vec
Date
2017
Abstract
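Divergent thinking tests assess an individual's creative potential by evaluating the number and quality of responses to open-ended problems. They are arguably the most commonly used creativity assessment tool and typically use fluency, originality, and flexibility as scoring indicators. Traditional scoring methods for divergent thinking tests suffer from complicated procedures and high costs, so the development of automated scoring techniques has become a topic of interest, with the expectation that computational methods can provide valid and convenient test results. Automated scoring methods based on semantic distance are common in creativity research; however, such methods still have room for improvement and lack complete studies of their reliability and validity. Using word2vec as the computational tool, this study proposes an automated scoring method for divergent thinking tests based on semantic distance and examines the reliability and validity of this scoring method.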
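The idea of scoring a response by its semantic distance from the stimulus can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so this example assumes the common definition of semantic distance as one minus the cosine similarity between two word vectors. The three-dimensional vectors, the word list, and the function names are all hypothetical stand-ins for trained word2vec embeddings and are not taken from the study.

```python
import math

# Hypothetical toy vectors standing in for trained word2vec embeddings.
# Real embeddings typically have hundreds of dimensions.
TOY_VECTORS = {
    "circle": [1.0, 0.0, 0.0],
    "ball": [0.9, 0.1, 0.0],
    "planet": [0.3, 0.8, 0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_distance(word_a, word_b, vectors=TOY_VECTORS):
    """Assumed definition: 1 - cosine similarity of the two word vectors."""
    return 1.0 - cosine_similarity(vectors[word_a], vectors[word_b])


def score_responses(stimulus, responses, vectors=TOY_VECTORS):
    """Map each response to its semantic distance from the stimulus word."""
    return {r: semantic_distance(stimulus, r, vectors) for r in responses}


# A response that is semantically farther from the stimulus gets a larger value.
scores = score_responses("circle", ["ball", "planet"])
```

Under this assumption, a larger distance would be read as a less common, more remote association. How the study combines such distances into fluency, originality, and flexibility scores is not stated in the abstract.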
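The participants were 493 university students, who completed a computerized figural creative thinking test. Of these, 47 took the test a second time after an interval of 12 months, and their data were used to analyze test-retest reliability; 99 also completed the figural task of the New Creative Thinking Test (新編創造思考測驗), and their data were used to analyze criterion-related validity; the data of the remaining participants were used to analyze common responses. The reliability analysis showed reliability coefficients of up to .83 (p<.01) for the fluency index, up to .71 (p<.01) for the originality index, and up to .76 (p<.01) for the flexibility index. The automated scoring method therefore has good test-retest reliability, indicating that it can provide stable and reliable test results. For the validity analysis, the figural task of the New Creative Thinking Test served as the criterion; the correlation coefficients reached up to .60 (p<.01) for the fluency index, up to .68 (p<.01) for the originality index, and up to .58 (p<.01) for the flexibility index. The automated scoring method thus has reasonably good criterion-related validity, indicating that it can validly measure an individual's creative potential.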
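A test-retest reliability coefficient of this kind is usually computed as the correlation between scores from the two sessions. The sketch below assumes a Pearson correlation, which the abstract does not explicitly name. The score lists are invented for illustration and are not data from the study; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented fluency scores for five hypothetical participants (not study data).
first_session = [8, 12, 5, 15, 10]
second_session = [9, 11, 6, 14, 12]

# Correlation between the two sessions, read as the test-retest coefficient.
retest_r = pearson_r(first_session, second_session)
```

The same function would apply to criterion-related validity by correlating the automated scores with scores on the criterion test, again assuming a Pearson correlation.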
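The word2vec-based automated scoring method developed in this study can carry out the scoring of divergent thinking tests fully automatically, and the results show that it has good reliability and validity. The method can therefore provide valid and convenient assessments of creative potential.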
Divergent thinking (DT) tests are a commonly used method of assessing an individual's creative potential. Typically, these tests use fluency, originality, and flexibility as scoring indicators. Traditional scoring methods for DT tests are lengthy, complicated, and costly, so the development of an automated scoring method for DT tests has become an important issue. Automated scoring methods based on semantic distance are commonly used in creativity research. However, these methods still leave room for improvement and lack complete studies of their reliability and validity. This article reports an automated scoring method for DT tests based on word2vec and examines its reliability and validity. Participants (N=493) completed a divergent thinking test, the Computerized Creative Association and Figures Test (C-CRAFT), as the primary measure. Of the 493 students, 47 retook the test 12 months later to assess test-retest reliability, and 99 also took Wu's Chinese Version of the Torrance Tests of Creative Thinking (CTTCT) to assess criterion-related validity. The remaining responses were used to analyze the commonality of responses among college students. The results show that the automated scoring method for DT tests based on word2vec has good test-retest reliability (r=.63~.83, p<.01) as well as good criterion-related validity (r=.51~.68, p<.01), and is thus a practically applicable method for DT test scoring. Limitations of the present study and directions for future research are discussed.
Description
Keywords
creativity test, divergent thinking test, semantic distance, word2vec