Title: 基於學習之階層式圖模型及其於影像分層及深度重建之研究
Learning based Hierarchical Graph for Image Matting and Depth Reconstruction
Authors: 曾禎宇
Tseng, Chen-Yu
王聖智
Wang, Sheng-Jyh
Department of Electronics Engineering and Institute of Electronics
Keywords: image matting; graph model; depth reconstruction
Issue Date: 2014
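Abstract: This dissertation proposes a hierarchical framework for constructing the matting Laplacian. The matting Laplacian is a graph model originally developed for image matting, a foreground extraction technique whose distinguishing feature is that it finely estimates the opacity of the foreground colors. However, the heavy computational load of algorithms based on the matting Laplacian is a major obstacle in practical applications. Moreover, because the matting Laplacian is itself a graph model, how to construct the graph vertices and their connectivity is an important issue. This dissertation therefore discusses how to reduce the computational cost of the matting Laplacian and also examines how to build graph vertices and estimate the connectivity between them. We propose an efficient structure that hierarchically converts image data into graph vertices. Within this hierarchical framework, we examine how to create the vertices of the graph model and the edges that connect them. For vertex creation, we examine how to condense image data from pixels into small clusters (cells). We propose a pixel contraction algorithm that uses a pixel-level graph model to condense similar pixels, shrinking the distances between strongly connected pixels in the graph, and then converts the contracted pixels into cells by means of a grid. Once the data points are converted from pixels into cells, their number drops substantially, which improves the efficiency of subsequent computation. In addition, because the edges between vertices are key to the quality of image analysis, we propose a multi-scale edge construction mechanism that learns local graph structures from image patches at multiple resolutions and aggregates the locally constructed graphs into a global graph model. Through this learning mechanism, we can learn the structural characteristics of image data for different applications, and the multi-scale analysis helps improve the accuracy of subsequent analysis. Based on the constructed graph model, this dissertation further presents two graph-based operations: graph partitioning and information propagation. Based on graph partitioning, we propose an unsupervised image matting technique that automatically decomposes an image and analyzes candidate foreground layers. Based on information propagation over the graph, we develop a depth image reconstruction technique that obtains a globally optimized depth reconstruction. For the image matting application, we demonstrate that the multi-scale graph edge method improves analysis accuracy: because our model learns image structures at different scales from multi-resolution images, it overcomes the limitations of conventional local analysis and improves the reliability of analyzing complex images. We also demonstrate the gain in computational efficiency from the proposed hierarchical structure. To analyze multiple foreground layers, we further introduce a top-level, component-level graph model into the proposed hierarchy. This high-level graph model analyzes the probability of combinations of matting components and thereby estimates the likelihood of foreground layers. Whereas conventional techniques usually stop at a binary foreground/background analysis, this technique provides a multi-layer analysis to interpret multiple foreground objects. For information propagation, this dissertation proposes a depth estimation technique based on multiple focus settings. The technique is built on a Maximum-A-Posteriori (MAP) framework, in which we construct a cell-level graph model from the multi-focus images and use the graph to build a spatial consistency model. By introducing this consistency model into the MAP framework for optimized estimation, the technique can effectively reconstruct 3-D spatial information from a small number of differently focused images.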
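The grid step of the pixel-to-cell conversion described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: it assumes each pixel already has a feature vector scaled to [0, 1] (for example, color and position after contraction), it omits the contraction step itself, and the function name `pixels_to_cells` and the parameter `bins` are hypothetical.

```python
import numpy as np

def pixels_to_cells(features, bins):
    """Group pixels into cells by quantizing feature vectors on a regular grid.

    features: (N, D) array of per-pixel features, assumed scaled to [0, 1].
    bins: number of grid bins per feature dimension (hypothetical parameter).
    Returns (cell_index, centers): the cell id of each pixel and the mean
    feature vector of each cell.
    """
    features = np.asarray(features, dtype=float)
    # Map every coordinate to an integer bin index in [0, bins - 1].
    grid = np.minimum((features * bins).astype(int), bins - 1)
    # Pixels that fall into the same grid bin form one cell.
    _, cell_index, counts = np.unique(
        grid, axis=0, return_inverse=True, return_counts=True
    )
    cell_index = cell_index.reshape(-1)
    # Represent each cell by the mean feature of its member pixels.
    sums = np.zeros((counts.size, features.shape[1]))
    np.add.at(sums, cell_index, features)
    centers = sums / counts[:, None]
    return cell_index, centers
```

Because only occupied bins become cells, the number of graph vertices depends on how many distinct bins the pixels occupy rather than on the number of pixels, which is the source of the reduction in data points that the abstract mentions.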
This dissertation presents a hierarchical framework for constructing the matting Laplacian. The matting Laplacian is a graph model originally proposed for the image matting problem. Image matting is the process of extracting foreground objects from an image by estimating the opacity of the foreground colors. The proposed framework addresses how to construct the graph model efficiently for practical applications. In addition, the way the graph model for the matting Laplacian is constructed can strongly affect the accuracy of image matting. This dissertation proposes a new construction procedure, covering the generation of both graph vertices and edges, to improve the accuracy of subsequent processes. In the proposed framework, an efficient scheme hierarchically compresses the image data into graph vertices. The framework progressively condenses image data from pixels into cells and presents a cell-level matting Laplacian. This construction process greatly reduces the number of graph vertices, and hence the computational load of subsequent processes. A learning procedure is also proposed to construct the edges of the cell-level graph. The graph affinity is learned from a set of image patches using a local-to-global scheme: a local sub-graph model is learned for each image patch, and the global graph model is then constructed by assembling all the local models. With this learning procedure, we derive a generalized graph model that can be used in a variety of applications. With the cell-level graph, two fundamental graph-based processes, partitioning and propagation, are further addressed. Based on graph partitioning, we present a new unsupervised matting method and use it to achieve multi-layer matting analysis.
We also present graph-based information propagation and apply it to 3-D depth reconstruction. For unsupervised image matting, we demonstrate that learning the graph edges improves the performance of graph decomposition. Finally, to further analyze the foreground layers, we introduce the final stage of the hierarchy, the component level. After the matting components are derived, a component-level graph model assembles them into foreground layers. Unlike conventional approaches, which typically address binary foreground/background partitioning, the proposed method provides a set of multi-layer interpretations for unsupervised matting. Experimental results show that the proposed approach generates more consistent and accurate results than state-of-the-art techniques. To demonstrate graph-based information propagation with the cell-level graph, this dissertation proposes a new depth reconstruction process for shape-from-focus (SFF). A Maximum-A-Posteriori (MAP) framework is presented that incorporates a spatial consistency prior. The spatial consistency model is based on a graph model learned from the multi-focus sequence. By solving the MAP estimation problem with this spatial consistency model, the proposed method performs well even when only a few image frames are available.
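As a rough illustration of how a graph-based spatial consistency prior can enter a MAP depth estimate, the sketch below uses a common quadratic formulation. This is an assumption made for illustration, not the dissertation's actual model: a Gaussian data term weighted by a per-cell confidence, plus a graph-Laplacian smoothness prior, which reduces the MAP estimate to one linear solve. The names `graph_laplacian`, `map_depth`, and the weight `lam` are hypothetical.

```python
import numpy as np

def graph_laplacian(affinity):
    """Return L = D - A for a symmetric, non-negative affinity matrix A."""
    affinity = np.asarray(affinity, dtype=float)
    return np.diag(affinity.sum(axis=1)) - affinity

def map_depth(initial_depth, confidence, affinity, lam=1.0):
    """Minimize sum_i c_i (d_i - y_i)^2 + lam * d^T L d over depths d.

    initial_depth: per-cell depth guesses y (e.g. from a focus measure).
    confidence: per-cell non-negative weights c; 0 means "no observation".
    affinity: (N, N) cell-level graph affinity (assumed given here).
    The system is solvable only if every connected component of the graph
    contains at least one cell with positive confidence.
    """
    y = np.asarray(initial_depth, dtype=float)
    C = np.diag(np.asarray(confidence, dtype=float))
    L = graph_laplacian(affinity)
    # Setting the gradient to zero gives (C + lam * L) d = C y.
    return np.linalg.solve(C + lam * L, C @ y)
```

In this sketch, cells with zero confidence receive their depth purely by propagation from connected neighbors, which is the role the abstract assigns to the graph-based spatial consistency model.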
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079611822
http://hdl.handle.net/11536/75505
Appears in Collections: Thesis