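A Preliminary Study of the Effectiveness of Computer-Assisted Conference Interpreting Preparation: The Case of Political Topics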
Date
2023
Authors
Abstract
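Within translation studies, scholarly discussion of translation technology has focused largely on written translation; publications on interpreting technology are comparatively scarce. In recent years, technology has advanced rapidly, and the language technologies that can support interpreting have improved along with it. This study investigates the effectiveness and impact of a language technology tool (AntConc) and a programming language package (spaCy) on interpreting preparation. The study first uses a Python package (newspaper3k) to identify seed words in the introductory text for a given speech, then semi-automatically (with BootCaT) searches the web for and downloads related articles or materials, and finally builds a specialized corpus on the topic. From this corpus, different word lists are extracted using different statistical or linguistic methods; the higher the coverage of a list's keywords or keyphrases with respect to the speech transcript, the better the quality of that list. The results show that the word coverage rate of the self-built specialized corpus is close to 70%. Although the individual keyword lists (TF, TF-IDF, LLR) do not achieve impressive coverage rates (12%, 37%, and 59%, respectively), the word lists, N-grams, collocates, and concordance lines obtained from AntConc can meet different needs in interpreting preparation: building background knowledge, identifying important English collocations, and helping interpreters anticipate how to translate keywords. In addition, keyword and N-gram techniques directly surface frequently occurring idioms and domain-specific language use. This study suggests that such digital tools may effectively reduce conference preparation time and improve the quality of interpreting services. In the future, integrating emerging technologies into the automation of interpreting preparation will surely help popularize interpreting technology.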
With the rapid development of technology, digital language tools for interpreting are attracting scholarly attention. This study explores the potential of language tools such as AntConc and Python libraries such as spaCy for interpreters’ conference preparation. It uses a Python library (Newspaper3k) to identify seed words in the introductory texts for the chosen speech. The seed words are used for online searches, and the related content is semi-automatically downloaded and compiled into a specialized corpus with BootCaT. Keywords and keyphrases are extracted by statistical or linguistic methods and compared against the target corpus built from the speech transcript to obtain a coverage rate; the higher the rate, the better the list. The results show that the overall coverage rate of the specialized corpus is close to 70%. Tokens in the keyword and keyphrase lists, N-grams, collocates, and concordance results retrieved through AntConc appear helpful to interpreters on several fronts, including building background knowledge and identifying prefabricated lexical bundles. Keywords combined with N-grams can also help interpreters create glossaries more quickly, more effectively, and with better quality. The study suggests that digital tools may enhance preparation efficiency and improve interpretation quality. In the future, an automated interpreting preparation workflow, if integrated with other new tools such as machine translation, is expected to gain popularity and make new technologies more accessible to all.
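The coverage comparison and the LLR keyword list mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not say whether coverage is counted over tokens or types, so token-level coverage is assumed here, and the tokenizer, the function names, and the use of Dunning's log-likelihood formula for LLR are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def coverage_rate(word_list, transcript_tokens):
    """Share of running transcript tokens that appear in word_list (token-level, assumed)."""
    if not transcript_tokens:
        return 0.0
    vocab = set(word_list)
    hits = sum(1 for tok in transcript_tokens if tok in vocab)
    return hits / len(transcript_tokens)


def log_likelihood(a, b, c, d):
    """Dunning's log-likelihood (G2) for a word seen a times in a study corpus
    of c tokens and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected count in the study corpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def llr_keywords(study_tokens, reference_tokens, top_n=10):
    """Rank words that are relatively more frequent in the study corpus by G2."""
    c, d = len(study_tokens), len(reference_tokens)
    if c == 0 or d == 0:
        return []
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / c > b / d:  # keep only words overrepresented in the study corpus
            scored.append((log_likelihood(a, b, c, d), word))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [word for _, word in scored[:top_n]]
```

In a workflow like the one described, `study_tokens` would come from the specialized corpus, `reference_tokens` from a general reference corpus, and the resulting list would be passed to `coverage_rate` together with the tokenized speech transcript.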
Description
Keywords
conference preparation for interpreting, keyword list, keyphrase list, automated specialized corpus compilation, interpreting technology, computer-assisted interpreting (CAI) tools