基於類神經之關聯詞向量表示於文本分類任務之研究

dc.contributor 陳柏琳 zh_TW
dc.contributor Chen, Berlin en_US
dc.contributor.author 石敬弘 zh_TW
dc.contributor.author Shih, Chin-Hong en_US
dc.date.accessioned 2019-09-05T11:13:09Z
dc.date.available 2017-08-16
dc.date.available 2019-09-05T11:13:09Z
dc.date.issued 2017
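dc.description.abstract With the rapid growth of information networks, demand for accessing text data over the Internet of Things keeps increasing, which has made text categorization a popular research topic in natural language processing. The core problem in text categorization today is the choice of feature representation. Most studies use the bag-of-words model as the feature representation of a text, but that model cannot effectively express the relationships between words and therefore loses the semantics of the text. In this thesis, we use two novel neural network architectures, Siamese nets and generative adversarial nets, so that the model learns more robust, semantically rich feature representations during training. The experiments use two well-known classification datasets: IMDB movie review classification and 20Newsgroups newsgroup classification. Results from a series of sentiment analysis and topic classification experiments show that the feature representations learned by these neural networks effectively improve text categorization performance. zh_TW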
dc.description.abstract With rapid global access to tremendous amounts of text data on the Internet, text categorization (or classification) has emerged as an important and popular research topic in the natural language processing (NLP) community, with many applications. Currently, the foremost problem in text categorization is feature representation, which is commonly based on the bag-of-words (BoW) model, where word unigrams, bigrams (n-grams), or some specifically designed patterns are typically extracted as the component features. It has been noted that the loss of word order caused by BoW representations is particularly problematic for document categorization. To exploit word order and proximity information in text categorization tasks, we explore a novel use of Siamese nets and generative adversarial nets for document representation and text categorization. In experiments conducted on two benchmark text categorization tasks, viz. IMDB and 20Newsgroups, we take advantage of these novel architectures to learn distributed vector representations of documents that reflect semantic relatedness. en_US
dc.description.sponsorship 資訊工程學系 (Department of Computer Science and Information Engineering) zh_TW
dc.identifier G060447003S
dc.identifier.uri http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060447003S%22.&%22.id.&
dc.identifier.uri http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/106441
dc.language Chinese
dc.subject 文本分類 zh_TW
dc.subject 表示學習 zh_TW
dc.subject 深度學習 zh_TW
dc.subject 連體網路 zh_TW
dc.subject 生成式對抗網路 zh_TW
dc.subject Text Categorization en_US
dc.subject Representation Learning en_US
dc.subject Deep Learning en_US
dc.subject Siamese Networks en_US
dc.subject Generative Adversarial Networks en_US
dc.title 基於類神經之關聯詞向量表示於文本分類任務之研究 zh_TW
dc.title Neural Relevance-aware Embedding For Text Categorization en_US

Files

Original bundle

Name: 060447003s01.pdf
Size: 1.23 MB
Format: Adobe Portable Document Format
