開放領域中文問答系統之建置與評估

dc.contributor: 曾元顯 (zh_TW)
dc.contributor: Tseng, Yuen-Hsien (en_US)
dc.contributor.author: 楊平 (zh_TW)
dc.contributor.author: Yang, Ping (en_US)
dc.date.accessioned: 2022-06-08T03:01:20Z
dc.date.available: 2021-12-31
dc.date.available: 2022-06-08T03:01:20Z
dc.date.issued: 2021
dc.description.abstract: 近年來隨著人工智慧技術日新月異,答案抽取式機器閱讀理解模型在 SQuAD 等資料集上已可超出人類的表現。而基於機器閱讀理解模型,加入了文章庫以及文件檢索器的問答系統架構,亦取得良好的成績。然而這樣子的資料集測試成效於實際應用上,可以達到什麼樣的效果是本研究好奇的問題。本研究主要進行了兩個任務,第一個為開發並比較不同的問答系統實作方式,以資料集自動化測試的方式評估何種實作方式的成效最好。第二個為將自動化測試表現最好的問答系統,交由受試者進行測試,並對實驗結果進行分析。最終得到的結果有四個。第一,本研究以中文維基百科做為文章庫;以Elasticsearch作為文件檢索器;以BERT-Base Chinese作為預訓練模型,並以DRCD資料集進行訓練的Sentence Pair Classification模型作為文件重排序器;以MacBERT-large作為預訓練模型,並以DRCD加上CMRC 2018資料集進行訓練的答案抽取式機器閱讀理解模型,作為文件閱讀器。此問答系統架構可以在Top 10取得本研究實驗的所有系統當中最好的成效,以DRCD Test set加上CMRC 2018 Dev set進行測試,得到的分數為F1 = 71.355,EM = 55.17。第二,本研究招募33位受試者,總計對系統進行了289道題目的測試,最終的成果為,在Top 10的時候有70.24%的問題能被系統回答,此分數介於自動化測試的F1與EM之間,代表自動化測試與使用者測試所得到的結果是相似的。第三,針對29.76%無法得到答案的問題進行分析,得到的結論是,大部分無法回答的原因是因為無法從文件庫中檢索正確的文章。第四,Top 1可回答的問題佔所有問題中的26.3%,而Top 2 ~ 10的佔比為43.94%。代表許多問題並非系統無法得出解答,而是排序位置不正確,若能建立更好的答案排序機制,將能大幅提升問答系統的實用性。(zh_TW)
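The F1 and EM figures quoted in the abstract follow the standard span-extraction evaluation used by DRCD and CMRC 2018: EM is an exact string match, F1 is character-level token overlap, and Top-k scoring takes the best score among the k candidate answers. The sketch below is an illustrative reimplementation under those assumptions, not the exact evaluation script used in this thesis.

```python
# Minimal sketch of SQuAD-style EM / F1 scoring for Chinese span extraction
# (character-level overlap), plus Top-k scoring as used in the abstract.
# Assumed metric definitions; not the thesis's actual evaluation code.
from collections import Counter

def exact_match(prediction, gold):
    return float(prediction.strip() == gold.strip())

def f1_score(prediction, gold):
    pred_chars = list(prediction.strip())
    gold_chars = list(gold.strip())
    common = Counter(pred_chars) & Counter(gold_chars)   # per-character overlap
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_chars)
    recall = overlap / len(gold_chars)
    return 2 * precision * recall / (precision + recall)

def best_of_top_k(candidates, gold):
    # A question is scored by its best candidate among the Top-k answers.
    return (max(exact_match(c, gold) for c in candidates),
            max(f1_score(c, gold) for c in candidates))
```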
dc.description.abstract: With the rapid development of artificial intelligence, span-extraction machine reading comprehension models have surpassed human performance on datasets such as SQuAD. Building on this achievement, question answering system architectures that combine a document collection, a document retriever, and a document reader have also achieved good results. However, whether such a system performs comparably in real-world use is the question this research explores. Our research has two tasks. The first is to develop and compare different QA system implementations, evaluating them automatically with datasets. The second is to have users test the best-performing QA system and to analyze the results. We obtained four findings. First, the best architecture uses Chinese Wikipedia as the document collection; Elasticsearch as the document retriever; a sentence-pair classification model, trained on the DRCD dataset from the BERT-Base Chinese pre-trained model, as the document re-ranker; and a span-extraction machine reading comprehension model, trained on the DRCD and CMRC 2018 datasets from the MacBERT-large pre-trained model, as the document reader. This architecture achieved the best Top 10 result among all the systems tested in our research: F1 = 71.355 and EM = 55.17, tested on the DRCD test set plus the CMRC 2018 dev set. Second, this study recruited 33 users, who tested the system with 289 questions in total; 70.24% of the questions could be answered by the system within the Top 10. This score lies between the F1 and EM scores of the dataset testing, indicating that the results of dataset testing and user testing are similar. Third, we analyzed the 29.76% of questions that went unanswered and found that in most cases the correct document could not be retrieved from the document collection. Fourth, questions answerable at Top 1 account for 26.3% of all questions, while Top 2–10 account for 43.94%. This means that for many questions the system does produce the correct answer but ranks it too low; a better answer-ranking mechanism would greatly improve the practicality of the QA system. (en_US)
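As a concrete illustration of the architecture described in the abstract (an Elasticsearch retriever over Chinese Wikipedia, a sentence-pair re-ranker, and a span-extraction reader producing a Top-k answer list), the following is a minimal sketch assuming the `elasticsearch` and HuggingFace `transformers` Python clients. The index name, field names, and model checkpoint paths are hypothetical placeholders, not artifacts of this thesis.

```python
# Sketch of a retriever -> re-ranker -> reader pipeline.
# "zhwiki", the "text" field, and the checkpoint paths are illustrative only.
from elasticsearch import Elasticsearch
from transformers import pipeline

es = Elasticsearch("http://localhost:9200")                       # document retriever (BM25)
reranker = pipeline("text-classification",
                    model="path/to/drcd-reranker")                # sentence-pair classifier (BERT-Base Chinese)
reader = pipeline("question-answering",
                  model="path/to/drcd-cmrc-reader")               # span-extraction MRC (MacBERT-large)

def relevance(question, passage):
    # Score the (question, passage) pair with the sentence-pair classifier.
    out = reranker({"text": question, "text_pair": passage})
    out = out[0] if isinstance(out, list) else out
    return out["score"]

def answer(question, top_k=10):
    # 1) Retrieve candidate passages with full-text search.
    hits = es.search(index="zhwiki",
                     query={"match": {"text": question}},
                     size=50)["hits"]["hits"]
    passages = [h["_source"]["text"] for h in hits]

    # 2) Re-rank passages by question-passage relevance.
    ranked = sorted(passages, key=lambda p: relevance(question, p), reverse=True)

    # 3) Extract an answer span from each of the Top-k passages.
    answers = [reader(question=question, context=p) for p in ranked[:top_k]]
    return sorted(answers, key=lambda a: a["score"], reverse=True)

print(answer("《哈利波特》的作者是誰?")[:3])
```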
dc.description.sponsorship: 圖書資訊學研究所 (Graduate Institute of Library and Information Studies) (zh_TW)
dc.identifier: 60815008E-39805
dc.identifier.uri: https://etds.lib.ntnu.edu.tw/thesis/detail/4dd33ea0ba5bf376dfa4eb061a2af228/
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/118327
dc.language: 中文 (Chinese)
dc.subject: 中文開放領域問答系統 (zh_TW)
dc.subject: 問答系統使用者測試 (zh_TW)
dc.subject: 機器閱讀理解 (zh_TW)
dc.subject: 深度學習 (zh_TW)
dc.subject: 人工智慧 (zh_TW)
dc.subject: Chinese Open-Domain Question Answering System (en_US)
dc.subject: User Testing of Question Answering System (en_US)
dc.subject: Machine Reading Comprehension (en_US)
dc.subject: Deep Learning (en_US)
dc.subject: Artificial Intelligence (en_US)
dc.title: 開放領域中文問答系統之建置與評估 (zh_TW)
dc.title: Development and Evaluation of Chinese Open-Domain Question Answering System (en_US)
dc.type: 學術論文 (Academic thesis)

Files

Original bundle

Name: 60815008E-39805.pdf
Size: 3.82 MB
Format: Adobe Portable Document Format
Description: 學術論文 (Academic thesis)