Visual Structuring and Retrieval Based on Automatic Closed Caption Detection and Caption Processing

Title:	Visual Structuring and Retrieval Based on Automatic Closed Caption Detection and Caption Processing (利用自動化字幕偵測與字幕處理來擷取結構化之視訊內容)
Authors:	Ming-Ho Hsiao (蕭銘和); Suh-Yin Lee (李素瑛); Institute of Computer Science and Engineering
Keywords:	Video Segmentation; Scene Identification; Caption Detection; Caption Localization; Font Size Differentiation
Issue Date:	2001
Abstract:	We propose a hierarchical approach to structured browsing and indexing of tennis video content. The hierarchy is constructed through video segmentation, shot selection, closed caption detection, and caption font size differentiation, and the resulting structure is stored in a digital video database. For such databases, the structured content representation provides browsing capabilities, the captions supply more meaningful information, and the video indexing supports more efficient content-based queries and retrieval. In this thesis, a novel approach to automatic closed caption detection and font size differentiation among localized text regions in I-frames of MPEG videos is proposed. The approach consists of five modules: video segmentation, shot selection, caption frame detection, caption localization, and font size differentiation. Tennis videos are selected as the case study, and the shot selection module is designed to automatically select a specific type of shot for further closed caption detection. Within shots found to contain captions, the captions are then localized more precisely: noise among the potential captions is filtered out based on the long-term consistency of the potential caption regions detected over consecutive frames. Once the general closed captions are localized, font size differentiation serves as a filter that helps users select specific, significant text captions, such as match scores and player names. These significant captions can support high-level video structuring, video browsing, video indexing, and video content description in MPEG-7. All of the proposed methods operate directly on MPEG compressed video, which saves computation time and improves processing efficiency. Experimental results show the effectiveness and feasibility of the proposed scheme.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#NT900392049 http://hdl.handle.net/11536/68463
Appears in Collections:	Thesis
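
Illustrative note: the abstract describes filtering caption noise by the long-term consistency of candidate regions over consecutive frames, followed by a font-size filter. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions. The box representation `(x, y, width, height)`, the IoU matching rule, the thresholds, the use of box height as a proxy for font size, and the function names `persistent_regions` and `filter_by_height` are all hypothetical and are not taken from the thesis.

```python
from typing import List, Tuple

# Hypothetical representation of a candidate caption region: (x, y, width, height).
Box = Tuple[int, int, int, int]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def persistent_regions(frames: List[List[Box]], min_run: int = 3,
                       min_iou: float = 0.8) -> List[Box]:
    """Return candidate boxes that persist over min_run consecutive frames.

    `frames` holds the candidate boxes detected in each consecutive frame
    (for example, successive I-frames). A box extends a run when it overlaps
    a box from the previous frame with IoU >= min_iou. Each run is reported
    once, at the frame where its length first reaches min_run.
    """
    active: List[Tuple[Box, int]] = []  # (box, run length) from the previous frame
    kept: List[Box] = []
    for candidates in frames:
        next_active: List[Tuple[Box, int]] = []
        for box in candidates:
            run = 1
            for prev_box, prev_run in active:
                if iou(box, prev_box) >= min_iou:
                    run = max(run, prev_run + 1)
            next_active.append((box, run))
            if run == min_run:
                kept.append(box)
        active = next_active
    return kept


def filter_by_height(boxes: List[Box], min_height: int) -> List[Box]:
    """Keep boxes at least min_height tall (assumed proxy for font size)."""
    return [b for b in boxes if b[3] >= min_height]


if __name__ == "__main__":
    frames = [
        [(10, 200, 100, 20), (50, 50, 30, 8)],   # second box is transient noise
        [(10, 200, 100, 20)],
        [(11, 200, 100, 20), (200, 10, 40, 8)],
    ]
    stable = persistent_regions(frames, min_run=3)
    print(filter_by_height(stable, min_height=16))
```

In this sketch, regions that appear in only one or two frames never reach the run length and are dropped, while a stable overlay such as a scoreboard survives. The height threshold then separates larger text from smaller text, which is one simple way to approximate selecting "significant" captions.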