A Study of Object Recognition Based on Probabilistic Semantic Analysis (以機率為基礎的語意分析之物件辨識研究)

Date

2009

Abstract

Recognizing objects from the semantic content of images is more reasonable than recognizing them from low-level image features alone. To bridge the semantic gap, that is, the gap between low-level image features and the high-level concepts of human cognition, we present an unsupervised approach that collects high-level concepts from images and builds a new image representation, which we call the probabilistic semantic component descriptor (pSCD). We first quantize low-level features into a set of visual words. We then apply a revised probabilistic Latent Semantic Analysis (pLSA) model to discover the hidden concepts that link visual words and images. From these discovered concepts we build the pSCD and apply it to object recognition. We also discuss how the number of hidden concepts affects the pSCD and how many are appropriate for describing a set of images. Finally, through object recognition experiments, we demonstrate that pSCD is more discriminative than other image representations, including Bag-of-Words (BoW) and pLSA representations.
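The first two steps of the abstract (quantizing features into visual words, then fitting a latent-concept model) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the thesis: k-means is used as the quantizer, the function names (`kmeans_codebook`, `bow_histogram`, `plsa`) are hypothetical, and the model is the standard pLSA fitted by EM, not the revised model the abstract mentions. The construction of pSCD itself is not described in the abstract and is not shown.

```python
import numpy as np

def quantize(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codeword."""
    # Squared Euclidean distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_codebook(descriptors, n_words, n_iter=20, seed=0):
    """Learn a visual-word codebook with plain k-means (assumed quantizer)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(descriptors), n_words, replace=False)
    codebook = descriptors[start].copy()
    for _ in range(n_iter):
        labels = quantize(descriptors, codebook)
        for k in range(n_words):
            members = descriptors[labels == k]
            if len(members):  # keep the old centre if a cluster is empty
                codebook[k] = members.mean(axis=0)
    return codebook

def bow_histogram(descriptors, codebook):
    """Bag-of-Words representation: visual-word counts for one image."""
    words = quantize(descriptors, codebook)
    return np.bincount(words, minlength=len(codebook)).astype(float)

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Standard pLSA fitted by EM (not the thesis's revised model).

    counts: (n_images, n_words) matrix of visual-word counts.
    Returns P(z|d) with shape (n_images, n_topics) and
    P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

# Random vectors stand in for real local descriptors (illustration only).
rng = np.random.default_rng(1)
images = [rng.normal(size=(40, 8)) for _ in range(6)]
codebook = kmeans_codebook(np.vstack(images), n_words=10)
counts = np.stack([bow_histogram(d, codebook) for d in images])
p_z_d, p_w_z = plsa(counts, n_topics=3)
```

In this sketch each row of `counts` is the BoW representation of one image, and each row of `p_z_d` is the corresponding pLSA representation, the two baselines the abstract compares against. The `n_topics` argument plays the role of the number of hidden concepts discussed in the abstract.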

Keywords

object recognition, semantic gap, visual word, bag-of-words model, image representation
