Evaluating Views About Scientific Inquiry (VASI) of Taiwanese High-School Students Using Text Classification Models

dc.contributor 邱瓊慧 zh_TW
dc.contributor Chiu, Chiung-Hui en_US
dc.contributor.author 吳家榮 zh_TW
dc.contributor.author Ung, Ka-Weng en_US
dc.date.accessioned 2023-12-08T07:40:42Z
dc.date.available 2027-11-15
dc.date.available 2023-12-08T07:40:42Z
dc.date.issued 2022
dc.description.abstract 來自台灣新北市某高中的538名10-11年級學生和來自台灣中部的65名10-11年級學生參與了這項為期兩年的研究。通過使用各種自動評量方法,本研究通過評估了各種方法在科學探究觀點的開放式中文版問卷上的評量表現,探討自動評量的可行性。研究的結果表明﹕(1)在八個不同 aspect 的題目情景下,比起單一分類模型,將各種表現好的傳統分類模型相混合在大部分情景下是能夠提高評量的表現。(2)在八個不同 aspect 的題目情景下,預先訓練好的BERT所微調出來的模型的評量表現都是顯著優於傳統分類模型、混合模型以及在Google的AutoML中訓練的模型。(3)在八個不同 aspect 的題目情景下,過採樣法和資料增強方法(機器翻譯)都能夠有效地提高各模型的性能。(4)每個問題的答案字數與模型的性能有一定程度的負相關,但訓練資料集的大小與模型性能沒有相關。基於上述成果,我們可以認為BERT和一些集成學習的技術(例如混合模型),再加以搭配過採樣法或資料增強方法(機器翻譯)以應對現實教育環境收集到的不平衡資料,可以切實有效地幫助研究人員建立自動評量系統來探索學生們的科學探究觀點是屬於哪一類型(天真型、過渡型、知識型)。此外,本研究發現答案的長度會影響到模型分類的效能,但資料的多少則不會影響,建議可以繼續深入地進行相關研究,作為這些技術的後續應用,教師們可以在使用問卷後更快地識別出表現好的和表現差的學生,對他們進行干預,或以其他措施幫助他們。 zh_TW
dc.description.abstract This study investigated the feasibility of automated assessment by evaluating the performance of various methods on an open-ended Chinese version of the Views About Scientific Inquiry questionnaire. A total of 538 10th- and 11th-grade students from a high school in New Taipei City, Taiwan, and 65 10th- and 11th-grade students from central Taiwan participated in this two-year study. The results showed that (1) blends of well-performing traditional classification models outperformed single classification models for more than half of the questions testing eight different aspects of scientific inquiry; (2) models fine-tuned from pre-trained BERT significantly outperformed the traditional classification models, the blending models, and the models trained in Google's AutoML for most aspects; (3) oversampling and data augmentation (machine translation) were effective at improving the performance of each model for each aspect; (4) answer word length was negatively correlated with model performance to some extent, but the size of the training data set was not correlated with model performance. We conclude that BERT and some ensemble learning techniques (e.g., blending models), coupled with oversampling or data augmentation (machine translation) to cope with imbalanced data, could assist researchers in building automated assessment systems that identify which type of view about scientific inquiry students hold (naive, transitional, or informed). In addition, this study found that answer length affected classification performance, but the amount of training data did not. Considering these results, teachers can use the questionnaires to quickly identify and help high- and low-performing students. en_US
dc.description.sponsorship 資訊教育研究所 zh_TW
dc.identifier 60808026E-42608
dc.identifier.uri https://etds.lib.ntnu.edu.tw/thesis/detail/73f7354feeaae1ab407f8d348132f797/
dc.identifier.uri http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/119797
dc.language English
dc.subject 科學探究 zh_TW
dc.subject 科學探究觀點 zh_TW
dc.subject 分類模型 zh_TW
dc.subject scientific inquiry en_US
dc.subject views about scientific inquiry en_US
dc.subject classification model en_US
dc.title 利用文本分類模型評量台灣高中生的科學探究觀 zh_TW
dc.title Evaluating Views About Scientific Inquiry (VASI) of Taiwanese High-School Students Using Text Classification Models en_US
dc.type etd
