Title: Automatic Extraction of English Collocations and their Chinese-English Bilingual Examples: A Computational Tool for Bilingual Lexicography
Original Title: 自動擷取英文搭配語及中英文例句:雙語辭典編纂學的計算工具
Authors: Zhao-Ming Gao (高照明)
Issue Date: May-2014
Publisher: Department of English, NTNU
Abstract: This paper describes the design of EXEC, a web-based system that automatically extracts English collocations and their Chinese-English bilingual examples from parallel corpora. EXEC is built on Chinese-English parallel corpora of more than 13 million English words and 27 million Chinese characters, and combines English collocation retrieval with a Chinese-English bilingual concordancer. The user enters a keyword, its part of speech, and the part of speech of the collocate being sought. The system then generates collocation candidates from the syntactic dependency relations produced by an English dependency parser, together with three statistical measures: mutual information, t-score, and log-likelihood ratio. Each identified collocation is linked to English example sentences that contain it and to their Chinese translations. Our evaluations show that EXEC exceeds 80% in both extraction accuracy and dictionary coverage, and that it extracts English collocations, example sentences, and Chinese translations from parallel corpora efficiently. EXEC can support the automatic compilation of bilingual collocation dictionaries and help Chinese learners of English overcome the L2 language barrier.
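The three association measures named in the abstract can be illustrated with a minimal sketch. It uses the textbook definitions (pointwise mutual information, the t-score approximation common in corpus linguistics, and Dunning's log-likelihood ratio over a 2x2 contingency table); the abstract does not give EXEC's exact formulas or thresholds, so this is an assumption rather than the system's implementation. The function name `association_scores` and its parameters are hypothetical, and the dependency-parsing step that selects candidate pairs is not shown.

```python
import math


def association_scores(pair_count, word_count, collocate_count, total):
    """Return (mutual information, t-score, log-likelihood ratio) for one pair.

    pair_count:      times the word and the collocate occur together
    word_count:      total occurrences of the word
    collocate_count: total occurrences of the collocate
    total:           total number of observations in the corpus
    All counts are assumed to be positive, with pair_count > 0.
    """
    # Co-occurrence count expected if the two words were independent.
    expected = word_count * collocate_count / total

    # Pointwise mutual information (base 2).
    mi = math.log2(pair_count / expected)

    # t-score: observed minus expected, scaled by sqrt(observed).
    t_score = (pair_count - expected) / math.sqrt(pair_count)

    # Dunning's log-likelihood ratio over the 2x2 contingency table.
    observed = [
        [pair_count, word_count - pair_count],
        [collocate_count - pair_count,
         total - word_count - collocate_count + pair_count],
    ]
    row_totals = [word_count, total - word_count]
    col_totals = [collocate_count, total - collocate_count]
    llr = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                e = row_totals[i] * col_totals[j] / total
                llr += o * math.log(o / e)
    return mi, t_score, 2 * llr
```

For example, `association_scores(20, 100, 200, 10000)` scores a pair seen 20 times where independence would predict 2. All three values come out positive, whereas a pair seen exactly as often as independence predicts scores zero on each measure.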
Other Identifiers: BE3A12CE-3804-9001-0A4B-2436A2449EB9
Appears in Collections: Concentric: Studies in Linguistics (同心圓:語言學研究)

