Title: | Robust 3D Object Recognition using 2D Views via an Incremental Similarity-Based Aspect-Graph Approach |
Authors: | 蘇宗敏 Tzung-Min Su; 胡竹生 Jwu-Sheng Hu; Institute of Electrical and Control Engineering |
Keywords: | 3D Object Recognition;Human Posture Recognition;Scene Recognition;Aspect-Graph;Background Subtraction;Gaussian Mixture Model |
Issue Date: | 2007 |
Abstract: | This work presents a framework for robustly recognizing 3D objects from 2D views. The framework comprises two stages: a pre-processing stage and an incremental database construction stage. In the pre-processing stage, foreground objects are extracted from 2D views and used for subsequent learning and recognition. In the incremental database construction stage, a 3D object database is built from 2D views of the object taken from different viewing angles, and the database can be updated whenever newly captured 2D views become available.
For the pre-processing stage, a background subtraction scheme involving highlight and shadow removal (BSHSR) is proposed, so that foreground regions can be extracted precisely from 2D views despite illumination variations and dynamic backgrounds. The BSHSR comprises three models: the color-based probabilistic background model (CBM), the gradient-based version of the color-based probabilistic background model (GBM), and a cone-shape illumination model (CSIM). The CBM is constructed by applying a Gaussian mixture model (GMM) to the statistics of each pixel's values. From the CBM, a short-term color-based background model (STCBM) and a long-term color-based background model (LTCBM) are derived, and these two models are then used to build the GBM. To distinguish foreground, highlight, and shadow pixels, the CSIM is proposed as a new dynamic cone-shape boundary in the RGB color space.
For the incremental database construction stage, a learning method based on the similarity-based aspect-graph (ISAG) is proposed. In a similarity-based aspect-graph, each 3D object in the database is represented by a set of aspects; each aspect contains a varying number of 2D views and is represented by one characteristic view. The incremental construction method, which forms the core of the framework, aims to maximize the similarity among the 2D views that belong to the same aspect and to minimize the similarity among the characteristic views (prototypes). To imitate the human ability to recognize objects, 2D views taken at randomly sampled angles on a viewing sphere serve as training images, and the 3D object database is updated as the number of collected views grows.
The effectiveness of the BSHSR is first demonstrated through experiments on several video clips collected in a complex indoor environment. The BSHSR is then applied within the proposed framework to extract foreground objects from 2D views. Finally, the framework is evaluated on three kinds of 3D object recognition problems: rigid-object recognition, human posture recognition, and scene recognition. Shape and color features are used with ISAG in these applications, and the resulting recognition rates are used to demonstrate the feasibility of the proposed framework. |
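The incremental aspect idea described in the abstract can be illustrated with a small sketch. It is not the thesis's ISAG algorithm: the abstract does not state the similarity measure, the update rule, or any threshold. The sketch assumes cosine similarity between view feature vectors, a fixed cut-off, and a characteristic view chosen as the member most similar to the rest of its aspect. The names `AspectDatabase`, `add_view`, `characteristic_view`, and `threshold` are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class AspectDatabase:
    """Toy incremental aspect database for one object (illustrative only)."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed similarity cut-off, not from the thesis
        self.aspects = []           # each aspect is a list of view feature vectors

    def characteristic_view(self, aspect):
        """Return the member with the highest total similarity to its aspect."""
        return max(aspect, key=lambda v: sum(cosine_similarity(v, w) for w in aspect))

    def add_view(self, view):
        """Insert a new view and return the index of the aspect that receives it."""
        best_index, best_score = None, -1.0
        for i, aspect in enumerate(self.aspects):
            score = cosine_similarity(view, self.characteristic_view(aspect))
            if score > best_score:
                best_index, best_score = i, score
        if best_index is not None and best_score >= self.threshold:
            # Similar enough to an existing characteristic view: join that aspect.
            self.aspects[best_index].append(view)
            return best_index
        # Otherwise the view starts a new aspect of its own.
        self.aspects.append([view])
        return len(self.aspects) - 1


# Views arrive one at a time, as they would from randomly sampled angles.
db = AspectDatabase(threshold=0.9)
for view in [(1.0, 0.0), (0.95, 0.1), (0.0, 1.0), (0.1, 0.9)]:
    db.add_view(view)
print(len(db.aspects), [len(a) for a in db.aspects])
```

In this toy, a higher `threshold` yields more, tighter aspects and a lower one yields fewer, looser aspects. The thesis balances within-aspect similarity against similarity between characteristic views by its own criterion, which this sketch does not reproduce.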
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT008912528 http://hdl.handle.net/11536/77035 |
Appears in Collections: | Thesis |