Title: 基于学习之阶层式图模型及其于影像分层及深度重建之研究
Learning based Hierarchical Graph for Image Matting and Depth Reconstruction
Authors: 曾祯宇
Tseng, Chen-Yu
王圣智
Wang, Sheng-Jyh
Department of Electronics Engineering and Institute of Electronics
Keywords: image matting; graph model; depth reconstruction
Issue Date: 2014
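Abstract: This dissertation proposes a hierarchical framework for constructing the matting Laplacian. The matting Laplacian is a graph model originally used for image matting, a technique for extracting foreground objects from an image whose distinguishing feature is its fine-grained estimation of the opacity of foreground colors. However, algorithms based on the matting Laplacian are computationally heavy, which is a major obstacle in practical applications. Moreover, because the matting Laplacian is itself a graph model, how the graph vertices and their connectivity are constructed is an important issue. This dissertation therefore discusses how to reduce the computational cost of the matting Laplacian, and also examines how to build the graph vertices and estimate the connectivity between them.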
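In this dissertation, we propose an efficient scheme that hierarchically converts image data into graph vertices. Within this hierarchical framework, we examine how to build the vertices of the graph model and the edges that connect them. For vertex construction, we examine how to condense image data from pixels into small clusters (cells). We propose a pixel contraction algorithm that uses a pixel-level graph model to condense similar pixels, shrinking the distances between strongly connected pixels in the graph, and then converts the contracted pixels into cells by means of a grid. Once the data points are converted from pixels to cells, their number drops substantially, which improves the efficiency of subsequent computation. In addition, because the edges between vertices are key to the quality of image analysis, we propose a multi-scale edge construction mechanism that learns local graph structures from image patches at multiple resolutions and assembles the locally constructed graphs into a global graph model. With this learning mechanism, we can learn the structural characteristics of the image data for different applications, and the multi-scale analysis helps improve the accuracy of subsequent analysis.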
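The grid step that turns pixels into cells can be illustrated with a minimal sketch. It assumes each pixel is already described by a feature vector scaled to [0, 1], and it omits the pixel contraction step entirely. The function name `pixels_to_cells` and the parameter `bins_per_dim` are hypothetical and do not come from the dissertation.

```python
import numpy as np

def pixels_to_cells(features, bins_per_dim):
    """Group pixel feature vectors into cells using a regular grid.

    features: (n_pixels, n_dims) array, assumed scaled to [0, 1].
    Returns (cell_of_pixel, cell_centers).
    """
    features = np.asarray(features, dtype=float)
    # Grid coordinate of every pixel along each feature dimension.
    coords = np.minimum((features * bins_per_dim).astype(int), bins_per_dim - 1)
    # Every occupied grid coordinate becomes one cell.
    _, cell_of_pixel = np.unique(coords, axis=0, return_inverse=True)
    cell_of_pixel = cell_of_pixel.reshape(-1)
    n_cells = cell_of_pixel.max() + 1
    # Cell center = mean feature vector of the pixels assigned to the cell.
    counts = np.bincount(cell_of_pixel, minlength=n_cells)
    centers = np.zeros((n_cells, features.shape[1]))
    np.add.at(centers, cell_of_pixel, features)
    centers /= counts[:, None]
    return cell_of_pixel, centers
```

Only occupied grid bins produce cells, so the number of graph vertices depends on how tightly the features cluster rather than on the image size.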
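Based on the constructed graph model, this dissertation further proposes two operations built on it: graph decomposition and information propagation. Based on graph decomposition, we propose an unsupervised image matting technique that automatically decomposes an image and analyzes the likelihood of multiple foreground layers. Based on information propagation over the graph, we develop a depth reconstruction technique that obtains a globally optimized depth reconstruction result.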
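For the image matting application, we show that the multi-scale graph edge construction improves analysis accuracy. Because our model learns image structures at different scales from multi-resolution images, it overcomes the limitations of conventional local analysis and improves reliability on complex images. We also show that the proposed hierarchical structure improves computational efficiency. To analyze multiple foreground layers, we further introduce a top-level, component-level graph model into the hierarchy; this higher-level graph model analyzes the probability of combinations of matting components and thereby estimates the likelihood of foreground layers. Whereas conventional techniques usually stop at a binary foreground/background analysis, the proposed technique provides a multi-layer analysis for interpreting multiple foreground objects.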
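For information propagation, this dissertation proposes a depth estimation technique based on multi-focus images. The technique is built on a Maximum-A-Posteriori (MAP) framework, in which we construct a cell-level graph model from the multi-focus images and use it to build a spatial consistency model. By introducing this consistency model into the MAP framework for optimized estimation, the technique can effectively reconstruct 3-D spatial information from a small number of differently focused images.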
This dissertation presents a hierarchical framework for constructing the matting Laplacian. The matting Laplacian is a graph model originally proposed for the image matting problem. Image matting is the process of extracting foreground objects from an image by estimating the opacity of the foreground colors. The proposed framework addresses how to construct the graph model efficiently for practical applications. In addition, the way the graph model for the matting Laplacian is constructed can strongly affect the accuracy of image matting. This dissertation proposes a new construction procedure, covering the generation of graph vertices and edges, to improve the accuracy of the subsequent processes.
In the proposed framework, we present an efficient scheme that hierarchically compresses image data into graph vertices. The framework progressively condenses image data from pixels into cells and yields a cell-level matting Laplacian. This construction process greatly reduces the number of graph vertices, and with it the computational load of subsequent processes. In addition, we propose a learning procedure to construct the edges of the cell-level graph. The graph affinity is learned from a set of image patches using a local-to-global scheme: a local sub-graph model is learned for each image patch, and the global graph model is then constructed by assembling all the local models. With this learning procedure, we derive a generalized graph model that can be used in a variety of applications.
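The local-to-global assembly can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's learning procedure: each local model is replaced by a plain Gaussian-affinity Laplacian, each patch is given as a list of distinct cell indices, and the names `local_laplacian`, `assemble_global_laplacian`, and `sigma` are hypothetical.

```python
import numpy as np

def local_laplacian(points, sigma=0.5):
    """Laplacian of a fully connected patch graph.

    Assumption: Gaussian affinities stand in for the learned local model.
    """
    diff = points[:, None, :] - points[None, :, :]
    w = np.exp(-np.sum(diff ** 2, axis=2) / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)  # no self-loops
    return np.diag(w.sum(axis=1)) - w

def assemble_global_laplacian(cell_features, patches, sigma=0.5):
    """Sum every local Laplacian into the rows/columns of its cells."""
    cell_features = np.asarray(cell_features, dtype=float)
    n = len(cell_features)
    L = np.zeros((n, n))
    for idx in patches:
        idx = np.asarray(idx)  # distinct cell indices covered by this patch
        L[np.ix_(idx, idx)] += local_laplacian(cell_features[idx], sigma)
    return L
```

Because each local term is a symmetric positive semidefinite Laplacian, their sum keeps those properties, and cells that never share a patch stay unconnected in the global graph.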
With the cell-level graph, we further address two fundamental graph-based processes: partitioning and propagation. Based on graph partitioning, we present a new unsupervised matting method and use it to achieve multi-layer matting analysis. Based on graph-based information propagation, we present an application to 3-D depth reconstruction.
For unsupervised image matting, we demonstrate that learning the graph edges improves the performance of graph decomposition. To further analyze the foreground layers, we introduce the final stage of the hierarchy, the component level. After deriving the matting components, we present a component-level graph model that assembles the components into foreground layers. Unlike conventional approaches, which typically address binary foreground/background partitioning, the proposed method provides a set of multi-layer interpretations for unsupervised matting. Experimental results show that the proposed approach generates more consistent and accurate results than state-of-the-art techniques.
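As a simplified illustration of graph decomposition, the sketch below splits a graph into two parts using the sign of the Fiedler vector, the eigenvector belonging to the second-smallest eigenvalue of the graph Laplacian. This is standard spectral bipartitioning, used here as a stand-in. It is not the multi-layer, component-based method described above, and the name `fiedler_partition` is hypothetical.

```python
import numpy as np

def fiedler_partition(W):
    """Two-way spectral partition of a connected graph.

    W: symmetric (n, n) affinity matrix with a zero diagonal.
    Returns a boolean array marking one side of the partition.
    """
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W  # unnormalized graph Laplacian
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]
    return fiedler >= 0
```

Which side is labeled `True` is arbitrary, because an eigenvector is only defined up to sign.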
To demonstrate graph-based information propagation on the cell-level graph, this dissertation proposes a new depth reconstruction method for shape-from-focus (SFF). A Maximum-A-Posteriori (MAP) framework is presented that incorporates a spatial consistency prior. The spatial consistency model is based on a graph model learned from the multi-focus image sequence. By solving the MAP estimation problem with this spatial consistency model included, the proposed method performs well even when only a few image frames are available.
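One common way to combine per-vertex depth observations with a graph-based smoothness prior is a quadratic MAP problem, sketched below. The formulation is an assumption made for illustration and is not taken from the dissertation: a Gaussian likelihood around an initial depth estimate with per-vertex confidences, and a Gaussian prior defined by a graph Laplacian `L`. Minimizing sum_i c_i (d_i - d0_i)^2 + lam * d^T L d leads to the linear system (C + lam * L) d = C d0. The names `map_depth`, `confidence`, and `lam` are hypothetical.

```python
import numpy as np

def map_depth(initial_depth, confidence, L, lam=1.0):
    """MAP depth under a Gaussian likelihood and a Laplacian smoothness prior.

    initial_depth: (n,) noisy per-vertex depth estimates d0.
    confidence:    (n,) positive weights c_i for each estimate (assumed > 0).
    L:             (n, n) graph Laplacian encoding spatial consistency.
    """
    d0 = np.asarray(initial_depth, dtype=float)
    C = np.diag(np.asarray(confidence, dtype=float))
    # Setting the gradient of the quadratic objective to zero gives
    # (C + lam * L) d = C d0.
    return np.linalg.solve(C + lam * L, C @ d0)
```

With `lam = 0` the estimate reduces to the initial depth; larger values let confident neighbors pull low-confidence vertices toward a spatially consistent depth.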
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079611822
http://hdl.handle.net/11536/75505
Appears in Collections: Thesis