A Study on Lecture Video Analysis Using Features Related to Slide Homography Mapping (以投影片單應性映射之相關特徵進行演講影片分析研究)
Date
2014
Abstract
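Presentation slides are a tool that speakers use to support their explanations, provide annotations, and help the audience quickly grasp the key points, but they cannot convey details clearly. Combined with streaming audio and video, whether in teaching, meetings, or speeches, they can give the audience much fuller detail, which in turn calls for a more efficient way to browse the content.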
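This work proposes an efficient and accurate method for matching slides to audio-video content. The process has three main parts. First, candidate frames are extracted from the video stream to reduce the time spent on subsequent matching. Next, image features are extracted and the matched feature points between each candidate frame and each slide are computed to obtain a similarity score. Low-confidence features are then filtered out using nearest-neighbor difference and random sample consensus (RANSAC); where conditions permit, homography features are used to obtain the approximate position of the slide. The proportion of the frame occupied by the slide is used to classify frames into two classes, "slide present" and "no slide". Frames with a slide are then matched directly using the similarity obtained earlier, and a voting mechanism corrects the results. In the end, 96% of the slide-switch time points are located correctly.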
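The "slide present" versus "no slide" classification described above can be illustrated with a minimal sketch. It assumes a 3x3 homography that maps slide coordinates to frame coordinates has already been estimated; the function names and the `min_ratio` threshold of 0.2 are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def project(H: Matrix3, p: Point) -> Point:
    """Map a slide point into frame coordinates with a 3x3 homography."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (
        (H[0][0] * x + H[0][1] * y + H[0][2]) / w,
        (H[1][0] * x + H[1][1] * y + H[1][2]) / w,
    )


def polygon_area(pts: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def slide_area_ratio(H: Matrix3,
                     slide_size: Tuple[float, float],
                     frame_size: Tuple[float, float]) -> float:
    """Fraction of the frame area covered by the projected slide outline."""
    sw, sh = slide_size
    fw, fh = frame_size
    corners = [(0.0, 0.0), (sw, 0.0), (sw, sh), (0.0, sh)]
    quad = [project(H, c) for c in corners]
    return polygon_area(quad) / (fw * fh)


def has_slide(H: Matrix3,
              slide_size: Tuple[float, float],
              frame_size: Tuple[float, float],
              min_ratio: float = 0.2) -> bool:
    """Label a frame as "slide present" when the projected slide is large
    enough. min_ratio is an assumed threshold, not a value from the thesis."""
    return slide_area_ratio(H, slide_size, frame_size) >= min_ratio
```

This sketch does not clip the projected quadrilateral to the frame boundaries, so a slide that extends partly outside the frame would be over-counted; a fuller implementation would intersect the quadrilateral with the frame rectangle first.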
Matching slides to video frames gives users a quick way to skim an entire video by slide content and to jump to any point in it, which may improve the user experience. Manually marking each timestamp in a video, however, is time-consuming. In this research, we develop an automatic process for this task. Given a slide file and a video file as input, the proposed method outputs the segmentation results. First, a heuristic method eliminates duplicate and similar frames in the recorded lecture video. A SIFT-based matching process is then applied, and the matched candidates are filtered by the nearest-neighbor ranking suggested by D. G. Lowe. Once the matched candidates are obtained, non-slide-frame detection prunes frames in which no slide is displayed. Before output, the recognition results are refined with context scoring mechanisms. A voting scheme is then applied to improve the frame-slide pairs, achieving about 96% coverage of slide-frame switches.
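The nearest-neighbor filtering step mentioned in the abstract can be sketched as a ratio test in the style of Lowe: a match is kept only when its nearest descriptor is clearly closer than the second nearest. The sketch below uses plain Python lists in place of real SIFT descriptors. The function names, the 0.8 ratio, and the use of the surviving match count as the similarity score are assumptions for illustration, not details confirmed by the thesis.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def euclidean(a: Descriptor, b: Descriptor) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(frame_desc: List[Descriptor],
                       slide_desc: List[Descriptor],
                       max_ratio: float = 0.8) -> List[Tuple[int, int]]:
    """Return (frame_index, slide_index) pairs that pass the ratio test.

    max_ratio is an assumed value; the thesis may use a different one.
    """
    matches: List[Tuple[int, int]] = []
    if len(slide_desc) < 2:
        return matches  # the test needs a second-nearest neighbor
    for i, d in enumerate(frame_desc):
        ranked = sorted((euclidean(d, s), j) for j, s in enumerate(slide_desc))
        (best, j), (second, _) = ranked[0], ranked[1]
        # Keep the match only if the best neighbor is clearly closer
        # than the runner-up; ambiguous matches are discarded.
        if best < max_ratio * second:
            matches.append((i, j))
    return matches


def similarity(frame_desc: List[Descriptor],
               slide_desc: List[Descriptor]) -> int:
    """Assumed similarity score: the number of matches that survive."""
    return len(ratio_test_matches(frame_desc, slide_desc))
```

Under this assumption, a candidate frame would be paired with the slide that yields the highest `similarity`, before any RANSAC filtering or voting is applied.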
Keywords
Candidate frame extraction, Image matching, Slide-frame matching, Random sample consensus (RANSAC), Homography