Title: | Discriminatively-learned Patch Selection for Image Retrieval with CNN
Authors: | Ku, Wei-Lin (古韋麟); Peng, Wen-Hsiao (彭文孝); Institute of Multimedia Engineering
Keywords: | Image Retrieval; Visual Search; Convolutional Neural Network; Object Recognition
Issue Date: | 2015
Abstract: | To improve the accuracy of image retrieval, this thesis proposes a discriminatively-learned patch selection method that strengthens the global image representation built from local CNN features. Our framework builds on MOP-CNN: we first replace its sliding-window patch detection with an objectness-based estimator so that the extracted local patches carry more useful and discriminative information. Then, to discard redundant, noisy patches, we design a patch selection mechanism tailored to retrieval. The selected patches are passed through a convolutional neural network (CNN) to extract features, and the resulting per-patch features are aggregated into a single global image representation by a simple max-pooling operation. The outcome is a global representation with greater discriminative power and a shorter feature dimension. Experimental results on four well-known retrieval datasets show that the representation produced with the proposed patch selection outperforms both MOP-CNN and a CNN used as a global feature extractor.
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070156644 http://hdl.handle.net/11536/126775 |
Appears in Collections: | Thesis |