Visual Attention Modeling on Structural Textures

Title:	Visual Attention Modeling on Structural Textures (original title: 在結構貼圖上的視覺行為模型)
Authors:	Yu, Hsuan-Ya (游軒雅); Lin, Wen-Chieh (林文杰); Institute of Computer Science and Engineering
Keywords:	perception;texture synthesis;eye-tracker;visual attention
Publication Date:	2010
Abstract:	Synthetic textures are widely used in computer graphics and image processing; however, little work has addressed the problem of evaluating the quality of synthetic textures from the perspective of the human visual system (HVS). This evaluation task involves the complicated processes of attention and cognition. To make the problem manageable, this thesis focuses on attention modeling, since attention is the first step in visual information processing. We propose a computational model of visual attention on structural textures, built by analyzing the gaze behavior of human subjects as they judge the quality of synthetic structural textures. Our model simulates both the bottom-up and the top-down processes of the HVS: the former extracts low-level structural features from a texture, while the latter is performed by a neural network that learns the association between these structural features and the fixations of human subjects in our training set. We compared the performance of our model with that of the saliency map, a widely used computational model of attention for natural images. On 44 test textures, our model correctly predicts 79% of fixation positions, whereas the saliency map predicts only 55%. Our model can guide texture synthesis, manipulation, and rendering algorithms to allocate computational resources efficiently to the regions most likely to attract human attention.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT079755634 http://hdl.handle.net/11536/45978
Appears in Collections:	Thesis