Refining Visual Representation for Generalized Zero-Shot Learning via Soft Visual-Semantic Alignment

陳彥合; Chen, Yan-He

dc.contributor	葉梅珍	zh_TW
dc.contributor	Yeh, Mei-Chen	en_US
dc.contributor.author	陳彥合	zh_TW
dc.contributor.author	Chen, Yan-He	en_US
dc.date.accessioned	2023-12-08T08:02:42Z
dc.date.available	2027-07-01
dc.date.available	2023-12-08T08:02:42Z
dc.date.issued	2022
dc.description.abstract	我們探討廣義零樣本學習的問題，其任務是預測目標圖像的標籤，無論其標籤屬於可見類別或是未見類別。我們發現大多數方法都學習了一個聯合嵌入空間，其中圖像特徵及其相應的類原型是對齊的。由於視覺空間和語義空間之間的固有差距，這種直接對齊可能很困難。我們提出放寬對齊要求，避免在圖像和語意嵌入之間進行成對比較，來實現一個新的學習框架。我們提出的間接視覺語意對齊方法 (Soft Visual-Semantic Alignment)，是通過對由精粹後的視覺特徵和目標類的類原型組成的連接特徵向量進行分類。此外我們使用圓損失(Circle Loss)來優化嵌入模型，該損失函數允許對不同的類內和類間相似性進行不同的懲罰強度。我們廣泛的實驗表明，間接對齊方式在學習區辨性和廣義視覺特徵方面更加靈活。我們證明了所提出方法的優越性，其性能與五個基準上的最新技術相當。	zh_TW
dc.description.abstract	We address the problem of generalized zero-shot learning, where the task is to predict the label of a target image regardless of whether that label belongs to a seen or an unseen class. We observe that most methods learn a joint embedding space in which image features and their corresponding class prototypes are aligned. Such direct alignment can be difficult because of the inherent gap between the visual and semantic spaces. We propose to relax the alignment requirement with a learning framework that avoids pairwise comparisons between image and class embeddings. Soft visual-semantic alignment is performed by classifying a concatenated feature vector consisting of the refined visual features and the class prototype of the target class. Furthermore, we optimize the embedding model with circle loss, which allows different penalty strengths for different within-class and between-class similarities. Our extensive experiments show that this indirect alignment is more flexible for learning discriminative and generalizable visual features. We demonstrate the effectiveness of the proposed method, which performs on par with the state of the art on five benchmarks.	en_US
dc.description.sponsorship	資訊工程學系	zh_TW
dc.identifier	60947074S-41548
dc.identifier.uri	https://etds.lib.ntnu.edu.tw/thesis/detail/c42929ab495fe48d8367a6aef2440c73/
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/121598
dc.language	中文
dc.subject	廣義零樣本學習	zh_TW
dc.subject	細粒度視覺辨識	zh_TW
dc.subject	視覺語義嵌入	zh_TW
dc.subject	間接對齊	zh_TW
dc.subject	圓損失函數	zh_TW
dc.subject	Generalized Zero-Shot Learning	en_US
dc.subject	Fine-Grained Visual Recognition	en_US
dc.subject	Visual-Semantic Embedding	en_US
dc.subject	Soft Alignment	en_US
dc.subject	Circle Loss	en_US
dc.title	通過間接視覺語義對齊改進廣義零樣本學習的視覺表徵	zh_TW
dc.title	Refining Visual Representation for Generalized Zero-Shot Learning via Soft Visual-Semantic Alignment	en_US
dc.type	etd

Collections

Theses and Dissertations
