Deep Learning Based Adaptive Feature Mapping Approach to Personalized Facial Expression Recognition

Title:	基於深度學習的適應性特徵映射應用於個人化表情辨識 (Deep Learning Based Adaptive Feature Mapping Approach to Personalized Facial Expression Recognition)
Author:	Lin, Chun-Hsien (林俊賢); Wu, Bing-Fei (吳炳飛); Institute of Electrical and Control Engineering (電控工程研究所)
Keywords:	Cross domain adaption; Adaptive learning; Deep learning; Facial expression recognition; Regression
Issue Date:	2016
Abstract:	Automatic facial expression recognition is an important capability for human-machine interfaces: a machine that can read a person's emotions can offer more attentive services. Deep learning performs very well when trained on large amounts of data, so many recent approaches are built on deep neural networks. A deep model's accuracy is limited, however, for conditions or populations that are not represented in its training data, and labeling new data is laborious and time consuming. This study therefore addresses how to personalize a trained generic model using unlabeled data from new subjects. We propose adaptive feature mapping (AFM), which transforms the feature distribution of the new data into that of the training data. By minimizing the error between each new sample and its most similar training sample, AFM pulls new-sample features that lie near confusable decision boundaries toward the centers of their categories, so that incorrectly predicted labels have a chance of being corrected. To first obtain a generic and robust deep model, we collected 23,591 facial expression images for training, most of them extracted from YouTube videos and then labeled. To help the model learn effectively, we also apply spatial normalization and feature enhancement: translation alignment, rotation alignment, neighbor-center difference images (NCDIs), and ellipse cropping. In tests on five databases, AFM improves the generic model's performance in specific environments in most cases, which suggests that the method has considerable potential for personalized recognition with deep models.
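The core idea in the abstract, mapping unlabeled new-subject features toward their closest training features, can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the affine map, the ridge term that keeps the map close to the identity, the alternating update, and the names `nearest_source`, `fit_feature_mapping`, and `apply_mapping` are all assumptions made for illustration. The thesis works with deep-network features, which are replaced here by plain feature arrays.

```python
import numpy as np

def nearest_source(mapped, source):
    """For each mapped target feature, return its closest source feature."""
    # Pairwise squared Euclidean distances, shape (n_target, n_source).
    d2 = ((mapped[:, None, :] - source[None, :, :]) ** 2).sum(axis=2)
    return source[d2.argmin(axis=1)]

def fit_feature_mapping(target, source, n_iters=10, ridge=1.0):
    """Hypothetical sketch: fit an affine map that pulls unlabeled target
    features toward their nearest source (training) features.

    Alternates between (1) matching each mapped target feature to its
    nearest source feature and (2) solving a ridge-regularized least-squares
    problem that keeps the map close to the identity.
    """
    n, d = target.shape
    X = np.hstack([target, np.ones((n, 1))])        # append a bias column
    A0 = np.vstack([np.eye(d), np.zeros((1, d))])   # identity map, zero bias
    A = A0.copy()
    gram = X.T @ X + ridge * np.eye(d + 1)
    errors = []
    for _ in range(n_iters):
        mapped = X @ A
        anchors = nearest_source(mapped, source)
        # Mean squared error to the matched source features, before updating A.
        errors.append(float(((mapped - anchors) ** 2).mean()))
        # Minimizes ||X A - anchors||^2 + ridge * ||A - A0||^2 in closed form.
        A = np.linalg.solve(gram, X.T @ anchors + ridge * A0)
    return A, errors

def apply_mapping(features, A):
    """Apply the learned affine map to a batch of features."""
    X = np.hstack([features, np.ones((len(features), 1))])
    return X @ A
```

In this sketch the mapped features would be passed to the unchanged classifier of the generic model. No labels for the new subject are used at any point, which matches the unlabeled setting the abstract describes.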
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070460018 http://hdl.handle.net/11536/140238
Appears in Collections:	Thesis