A Study on Single/Multiple Sprite Generation and Partition for Videos

Title:	A Study on Single/Multiple Sprite Generation and Partition for Videos
Authors:	I-Sheng Kuo (郭萓聖); Ling-Hwei Chen (陳玲慧); Institute of Computer Science and Engineering
Keywords:	Sprite Generation; Sprite Coding; Multiple Sprites; Image Blending; Sequence Splitting; MPEG-4
Issue Date:	2007
Abstract:	Sprite coding, a technique adopted in MPEG-4 object-based coding, can greatly increase the coding efficiency of backgrounds. The sprite generator introduced in MPEG-4 blends all frames by arithmetic averaging, which blurs parts of the sprite, especially regions that were at some point occupied by moving objects. To prevent this blurring, MPEG-4 suggests that the user provide segmentation masks marking the moving objects in each frame, so that objects are not blended into the sprite. We build a sprite generation system based on the MPEG-4 framework, but producing segmentation masks manually for every frame is impractical.
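The blurring effect of averaging, and the way a segmentation mask avoids it, can be illustrated with a minimal Python sketch. It assumes the frames are already aligned to the sprite's coordinate system and are stored as small lists of grey levels; the function name `blend_average` and the toy data are hypothetical and are not taken from the thesis or from the MPEG-4 reference software.

```python
def blend_average(frames, masks=None):
    """Average aligned frames pixel by pixel.

    frames: list of equally sized 2D lists of grey levels, assumed to be
            already warped into the sprite's coordinate system.
    masks:  optional list of 2D lists of booleans, True where a pixel
            belongs to a moving object and must be skipped.
    """
    height, width = len(frames[0]), len(frames[0][0])
    sprite = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            samples = [
                frame[y][x]
                for i, frame in enumerate(frames)
                if masks is None or not masks[i][y][x]
            ]
            # If every frame is masked at this pixel, fall back to all samples.
            if not samples:
                samples = [frame[y][x] for frame in frames]
            sprite[y][x] = sum(samples) / len(samples)
    return sprite


# Toy data: background level 100; an object of level 200 covers the
# middle pixel in the second frame only.
frames = [
    [[100, 100, 100]],
    [[100, 200, 100]],
    [[100, 100, 100]],
    [[100, 100, 100]],
]
masks = [
    [[False, False, False]],
    [[False, True, False]],
    [[False, False, False]],
    [[False, False, False]],
]

blurred = blend_average(frames)        # [[100.0, 125.0, 100.0]]
clean = blend_average(frames, masks)   # [[100.0, 100.0, 100.0]]
```

Without masks the object leaks into the middle pixel (125 instead of 100); with the mask the background value is recovered.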
An automatic segmentation mask generation method is therefore proposed and applied in the sprite generation system. The system first produces a coarse sprite using the MPEG-4 method without segmentation masks. The coarse sprite then serves as the reference image for the proposed segmentation mask generation method, and a better sprite is regenerated using the generated masks. Experimental results show that the sprite generated by the proposed system has good visual quality. Automatic image segmentation, however, cannot produce perfect object segmentation masks. Segmentation errors cause parts of moving objects to be blended into the sprite, leaving ghost-like shadows in it. To address this problem, a sprite generator that requires no segmentation masks is proposed in this dissertation. It consists of two novel methods: a balanced feature point extraction method and an intelligent blending method. The feature point extraction method estimates the motion vector of the background, uses it to exclude feature points that belong to moving objects, and spreads the remaining feature points evenly over the frame. The intelligent blending method uses a simple counting scheme so that only background pixels are blended into the sprite. Experimental results show that the feature points extracted by the proposed method increase the accuracy of global motion estimation and thereby the quality of the generated sprites. Because the intelligent blending method excludes pixels of moving objects directly in the blending procedure, the ghost-like shadows caused by segmentation errors do not appear in the sprites generated by our method. The visual quality of our sprites is close to that of sprites generated with manually segmented masks and better than that of sprites generated by the method of Smolic et al., which relies on automatic object segmentation.
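The abstract does not give the details of either the mask generation or the counting scheme, so the following Python sketch is only one plausible reading of each idea, not the authors' algorithm. It assumes that a mask can be obtained by thresholding the difference between a frame and the coarse sprite, and that "counting" means keeping, per pixel, only the samples that fall into the most frequently observed grey-level bin. The names `masks_from_coarse_sprite` and `blend_by_counting`, the threshold, and the bin width are all hypothetical.

```python
from collections import Counter


def masks_from_coarse_sprite(frames, coarse, threshold=50):
    """Mark a pixel as 'object' when it differs strongly from the coarse sprite.

    Assumption: simple absolute-difference thresholding against the
    coarse sprite, which is used as the reference image.
    """
    return [
        [
            [abs(value - ref) > threshold for value, ref in zip(row, ref_row)]
            for row, ref_row in zip(frame, coarse)
        ]
        for frame in frames
    ]


def blend_by_counting(frames, bin_width=16):
    """Blend aligned frames without masks.

    Assumption: for each pixel, count how many frames fall into each
    grey-level bin and average only the samples of the most common bin.
    """
    height, width = len(frames[0]), len(frames[0][0])
    sprite = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            samples = [frame[y][x] for frame in frames]
            counts = Counter(value // bin_width for value in samples)
            best_bin = counts.most_common(1)[0][0]
            kept = [v for v in samples if v // bin_width == best_bin]
            sprite[y][x] = sum(kept) / len(kept)
    return sprite


# Toy data: background level 100; an object of level 200 covers the
# middle pixel in the second frame only.
frames = [
    [[100, 100, 100]],
    [[100, 200, 100]],
    [[100, 100, 100]],
    [[100, 100, 100]],
]
coarse = [[100.0, 125.0, 100.0]]  # plain average of the four frames

masks = masks_from_coarse_sprite(frames, coarse)
# masks[1] == [[False, True, False]]; all other masks are all False

sprite = blend_by_counting(frames)
# sprite == [[100.0, 100.0, 100.0]]
```

In this toy case both routes remove the object from the sprite. Fixed bins are a simplification: values that straddle a bin boundary would be split across two bins, so a real implementation would need a more careful grouping rule.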
Sprite generation applies a geometric transformation to map each non-reference frame into the coordinate system of the reference frame. This distorts the transformed frames and the sprite blended from them, which increases the storage the sprite requires and restricts the range of view angles it can cover relative to the reference frame. Farin et al. addressed this problem by using multiple sprites: the total storage of several sprites can be smaller than that of a single sprite, and together they can cover a wider range of view angles. Their method is optimal, but it uses an exhaustive search to find the best partition and reference frames. For a video of N frames, the method of Farin et al. requires O(N^3) time and O(N^2) space. To reduce this complexity, a fast multiple sprite partition method is proposed in this dissertation. It includes a candidate partition point finding method and a fast reference frame selection method. The partition point finding method measures the translation and scaling between frames and uses the measured values to find candidate partition points. The final partition points are chosen from these candidates, splitting the video into sub-sequences; the reference frame of each sub-sequence is found by the fast reference frame selection method, and one sprite is generated per sub-sequence. If M candidate partition points are found, the proposed method requires only O(M^2 N) time and O(M^2) + O(N) space. The total size of the generated sprites is only slightly larger than that obtained with the method of Farin et al.
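How candidate partition points might be derived from measured translation and scaling can be sketched as follows. The abstract does not state the actual criterion, so this Python example assumes a simple rule: accumulate the frame-to-frame motion and propose a candidate whenever the accumulated shift or zoom exceeds a limit. The function name `candidate_partition_points`, the motion tuples, and both limits are hypothetical.

```python
import math


def candidate_partition_points(motions, max_shift, max_zoom):
    """Propose frame indices at which a new sub-sequence may start.

    motions[i] = (dx, dy, scale) is the assumed global motion from
    frame i to frame i + 1, with scale > 0 (1.0 means no zoom).
    A candidate is emitted when the motion accumulated since the last
    candidate exceeds max_shift pixels or a zoom factor of max_zoom
    (in either direction); the accumulators are then reset.
    """
    candidates = []
    shift_x = shift_y = 0.0
    log_zoom = 0.0
    for i, (dx, dy, scale) in enumerate(motions):
        shift_x += dx
        shift_y += dy
        log_zoom += math.log(scale)
        too_far = math.hypot(shift_x, shift_y) > max_shift
        too_zoomed = abs(log_zoom) > math.log(max_zoom)
        if too_far or too_zoomed:
            candidates.append(i + 1)
            shift_x = shift_y = 0.0
            log_zoom = 0.0
    return candidates


# A steady pan of 10 pixels per frame over 10 frames (9 motions).
pan = [(10.0, 0.0, 1.0)] * 9
pan_candidates = candidate_partition_points(pan, max_shift=25, max_zoom=2.0)
# pan_candidates == [3, 6, 9]

# A zoom of 1.5x per frame exceeds a 2x limit after two steps.
zoom = [(0.0, 0.0, 1.5)] * 2
zoom_candidates = candidate_partition_points(zoom, max_shift=25, max_zoom=2.0)
# zoom_candidates == [2]
```

To give a feel for the complexity figures, ignoring constant factors: with N = 300 frames and M = 5 candidates, N^3 is 27,000,000 while M^2 N is 7,500.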
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT008823816 http://hdl.handle.net/11536/64779
Appears in Collections:	Thesis

Files in This Item:

381601.pdf
