Title: Image Representation, Matching, and Recognition Using Invariant Zernike Moment Descriptors (應用不變量澤爾尼克矩描述元進行影像之表示、比對及辨識)
Authors: Sun, Shu-Kuo (孫樹國)
Chen, Zen (陳稔)
Institute of Computer Science and Engineering
Keywords: Zernike moments; image representation and matching; region descriptors; geometric and photometric transformations; phase; view registration
Issue Date: 2008
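Abstract: This dissertation addresses research in 3D computer vision on scene analysis, recognition, and registration using two or more images of a scene taken from different viewpoints or under different illumination conditions. The problems to be overcome include geometric transformations between images (rotation, scale change, translation, and geometric deformation), photometric transformations (image blur, illumination change, noise, image compression, and so on), partial occlusion, and the computational efficiency of image registration. First, we propose an invariant region descriptor based primarily on the phase information of Zernike moments, together with a method for accurately estimating the rotation angle between two feature regions, which solves the rotational alignment problem, and a matching function that achieves high reliability. Overall, under the geometric and photometric transformations above, the new Zernike moment phase descriptor has higher discriminative power than five major existing methods; the dissertation also includes qualitative and quantitative analyses explaining the reasons for the performance differences among these descriptors. Second, we extend this region descriptor to logo recognition for mobile device services, where it can be used in applications such as corporate identification, company website access, traffic sign recognition, and security checks. The main challenge here is the geometric and photometric transformations that are unavoidable when images are captured with a mobile device. We propose two similarity measures, one for classification and one for retrieval, and experiments show that the proposed method outperforms three major existing methods. Finally, we propose an image registration method, different from traditional ones, that more efficiently achieves the one-to-one feature point correspondences needed to register images from different viewpoints. The method analyzes the reference image in advance to obtain important information that guides the registration process. In an offline stage, a database of selection order is built in advance for the feature points in the reference image according to the following five planning strategies: (1) feature point invariance to image deformation, (2) resistance to image noise, (3) discriminative power of the descriptor, (4) effectiveness of model estimation, and (5) ability to handle partial image overlap. Consequently, when a sensed image is obtained for registration, the one-to-one feature point correspondences between the two images can be established more efficiently to estimate the transformation model between them.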
In 3D computer vision, a real-world scene is represented by multiple views imaged under different viewpoints and illumination conditions. The spatial and temporal relationships across these views are important to scene analysis and understanding. The global and local features of the objects (foreground and background) in the scene are the clues for deriving these relationships. Local features, which relate to local object surface patches or regions, are more robust to viewpoint change than global features. In addition, invariance under photometric transformations such as blur, illumination, scale, noise, and JPEG compression is also receiving great attention. This dissertation addresses local image representation, matching, and recognition under the above image variations. First, a new distinctive image descriptor, primarily comprising Zernike moment (ZM) phase information, is proposed to represent the normalized regions extracted by an affine region detector. An accurate and robust estimate of the possible rotation angle between a pair of normalized regions is then described; it is used to measure the similarity between two matching regions. The discriminative power of the new ZM phase descriptor is compared with that of five major existing region descriptors based on the precision-recall criterion. Experimental results involving more than 15 million region pairs indicate that the proposed ZM phase descriptor has the best overall performance under the common photometric and geometric transformations. Both quantitative and qualitative analyses of descriptor performance are given to account for the performance discrepancy. Second, the proposed ZM phase descriptor is extended into a new method for recognizing logos imaged by mobile phone cameras.
Logo recognition can be incorporated into mobile phone services for use in enterprise identification, corporate website access, traffic sign reading, security checks, content awareness, and related applications. The main challenge in applying logo recognition to mobile phone applications is the inevitable photometric and geometric transformations. The proposed ZM phase recognition method is associated with two similarity measures. Logo classification and retrieval experiments show that the proposed ZM phase method has the best performance under typical photometric and geometric transformations, compared with three other major existing methods. Finally, for the one-to-one feature matching correspondences in view registration, we propose an efficient registration method that differs from traditional methods. We preprocess the reference image offline to gather important statistics for guiding image registration. Specifically, we introduce five planning strategies to sort the feature points in the reference image based on (1) feature invariance to image deformation, (2) image noise resistance, (3) distinctive description power, (4) model estimation effectiveness, and (5) partial image overlap handling capability. A reference matching database is thus constructed offline using these five planning strategies. An online registration process then estimates the transformation model that overlays the reference image on an incoming sensed image. In this way, better registration efficiency can be achieved.
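The abstract's reliance on ZM phase rests on a standard property of Zernike moments: rotating a region by an angle α leaves each moment's magnitude unchanged and shifts its phase by mα, where m is the repetition index. The sketch below illustrates only that property on a synthetic patch. It is not the dissertation's descriptor, rotation estimator, or matching function; the function names (`radial_poly`, `zernike_moment`, `estimate_rotation`, `make_patch`), the choice of orders, the restriction to m = 1, and the Gaussian test patch are all assumptions made for illustration.

```python
import math
import numpy as np

def radial_poly(n, m, rho):
    """Zernike radial polynomial R_nm(rho), for n >= |m| and n - |m| even."""
    m = abs(m)
    out = np.zeros_like(rho)
    for s in range((n - m) // 2 + 1):
        coeff = ((-1) ** s * math.factorial(n - s)
                 / (math.factorial(s)
                    * math.factorial((n + m) // 2 - s)
                    * math.factorial((n - m) // 2 - s)))
        out += coeff * rho ** (n - 2 * s)
    return out

def zernike_moment(patch, n, m):
    """Approximate Z_nm of a square patch mapped onto the unit disk."""
    size = patch.shape[0]
    coords = np.linspace(-1.0, 1.0, size)
    x, y = np.meshgrid(coords, coords)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0                      # only pixels inside the unit disk
    basis_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    pixel_area = (2.0 / (size - 1)) ** 2     # Riemann-sum area element
    total = np.sum(patch[inside] * basis_conj[inside])
    return (n + 1) / np.pi * total * pixel_area

def estimate_rotation(patch_a, patch_b, orders=(1, 3, 5, 7)):
    """Simplified estimate (assumption): use only the m = 1 moments.

    If patch_b is patch_a rotated by alpha, then Z_b = Z_a * exp(-1j * alpha)
    for m = 1, so each product Z_b * conj(Z_a) has phase -alpha and a weight
    equal to the squared magnitude of the moment.
    """
    acc = 0j
    for n in orders:
        acc += zernike_moment(patch_b, n, 1) * np.conj(zernike_moment(patch_a, n, 1))
    return float((-np.angle(acc)) % (2 * np.pi))

def make_patch(alpha, size=129):
    """Hypothetical test patch: two Gaussian blobs rotated by alpha radians."""
    coords = np.linspace(-1.0, 1.0, size)
    x, y = np.meshgrid(coords, coords)
    # Sample the unrotated pattern at coordinates rotated back by alpha.
    xr = x * np.cos(alpha) + y * np.sin(alpha)
    yr = -x * np.sin(alpha) + y * np.cos(alpha)
    return (np.exp(-((xr - 0.3) ** 2 + (yr - 0.1) ** 2) / 0.05)
            + 0.5 * np.exp(-((xr + 0.2) ** 2 + (yr + 0.4) ** 2) / 0.02))

reference = make_patch(0.0)
rotated = make_patch(0.7)
# The estimate is expected to be close to the 0.7 rad used to build `rotated`.
print(estimate_rotation(reference, rotated))
```

Using only m = 1 avoids the mα phase ambiguity that higher repetitions introduce, at the cost of discarding most of the phase information. A descriptor built for real regions would also have to cope with interpolation, noise, and moments of small magnitude, whose phase is unreliable.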
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT008817810
http://hdl.handle.net/11536/61335
Appears in Collections: Thesis


Files in This Item:

  1. 781001.pdf
