Title: Warping of Human Face View using Convolutional Deep Belief Networks
Authors: Chen, Kuan-Ting
Wang, Sheng-Jyh
Chien, Feng-Tsun
Department of Electronics Engineering and Institute of Electronics
Keywords: deep learning; convolutional deep belief nets; warping; warping of human face view
Issue Date: 2014
Abstract: In this thesis, we aim to find a better way of representing human face images and of linking the representations of corresponding face images using the learning approach of convolutional deep belief networks (DBNs). Once such links are established, the convolutional DBN can infer a human face image at one view angle from a given image of the same face at another view angle. The initial observation behind this approach is that when an object moves or rotates slowly, its feature representation also changes only slowly; if the pattern of these feature changes can be learned and modeled by the deep architecture, warping of the human face view can be realized. We adopt the convolutional deep belief network as the feature extractor because it has two important properties. First, when an object translates, the features representing it translate correspondingly in space. Second, when the object moves slightly within a small region, the network still describes it with the same features, unaffected by these small variations. Together, these properties yield a model that is more robust to translated data. The training algorithm we use is an unsupervised learning procedure called pre-training. After pre-training, the model becomes a generative model that specifies a joint distribution over the data and the hidden states. Therefore, given an image of a human face, the model can generate the face at a different view angle from the hidden states that are correlated with the input.
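The abstract does not spell out the model equations, but the building block of a convolutional DBN is the convolutional restricted Boltzmann machine (CRBM). A standard formulation of the joint distribution such a pre-trained model specifies, following Lee et al.'s convolutional RBM (an assumption here; the thesis may use a variant), is, for a binary visible image $v$, $K$ groups of binary hidden units $h^k$, filters $W^k$, hidden biases $b_k$, and a shared visible bias $c$:

$$E(v,h) = -\sum_{k=1}^{K} \sum_{i,j} h^k_{ij} \big(\tilde{W}^k * v\big)_{ij} - \sum_{k=1}^{K} b_k \sum_{i,j} h^k_{ij} - c \sum_{i,j} v_{ij}, \qquad P(v,h) = \frac{1}{Z} e^{-E(v,h)},$$

where $Z = \sum_{v,h} e^{-E(v,h)}$, $*$ denotes convolution, and $\tilde{W}^k$ is $W^k$ flipped in both dimensions. Inference alternates between the factorized conditionals

$$P(h^k_{ij}=1 \mid v) = \sigma\big((\tilde{W}^k * v)_{ij} + b_k\big), \qquad P(v_{ij}=1 \mid h) = \sigma\Big(\textstyle\sum_k (W^k * h^k)_{ij} + c\Big),$$

with $\sigma$ the logistic sigmoid; it is this conditional structure that lets the model fill in hidden states from a given face image and generate a new view from them.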
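To make the pre-training step concrete, the following is a minimal sketch of contrastive-divergence (CD-1) training for a single convolutional RBM layer, the unit stacked greedily to form a convolutional DBN. The ConvRBM class, its hyper-parameters, and the choice of CD-1 are illustrative assumptions for this sketch, not details taken from the thesis.

import numpy as np
from scipy.signal import correlate2d, convolve2d

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ConvRBM:
    """One convolutional RBM layer with binary units (illustrative)."""
    def __init__(self, n_filters=8, filter_size=7):
        # W[k]: k-th filter; b[k]: bias of hidden group k; c: shared visible bias.
        self.W = 0.01 * rng.standard_normal((n_filters, filter_size, filter_size))
        self.b = np.zeros(n_filters)
        self.c = 0.0

    def hidden_probs(self, v):
        # p(h^k_ij = 1 | v) = sigmoid((W~^k * v)_ij + b_k); 'valid' correlation.
        return np.stack([sigmoid(correlate2d(v, Wk, mode="valid") + bk)
                         for Wk, bk in zip(self.W, self.b)])

    def visible_probs(self, h):
        # p(v_ij = 1 | h) = sigmoid(sum_k (W^k * h^k)_ij + c); 'full' convolution.
        act = sum(convolve2d(hk, Wk, mode="full") for hk, Wk in zip(h, self.W))
        return sigmoid(act + self.c)

    def cd1_step(self, v0, lr=0.01):
        # One CD-1 update: positive phase, sample hiddens, reconstruct, negative phase.
        p_h0 = self.hidden_probs(v0)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        v1 = self.visible_probs(h0)          # mean-field reconstruction
        p_h1 = self.hidden_probs(v1)
        # Approximate gradient: <v h>_data - <v h>_reconstruction.
        for k in range(len(self.W)):
            pos = correlate2d(v0, p_h0[k], mode="valid")
            neg = correlate2d(v1, p_h1[k], mode="valid")
            self.W[k] += lr * (pos - neg)
            self.b[k] += lr * (p_h0[k].mean() - p_h1[k].mean())
        self.c += lr * (v0.mean() - v1.mean())
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error for monitoring

# Toy usage: pre-train on random binary images standing in for face patches.
rbm = ConvRBM()
for step in range(100):
    v = (rng.random((32, 32)) < 0.5).astype(float)
    err = rbm.cd1_step(v)

Layers pre-trained this way are stacked greedily, with each layer trained on the hidden activations of the one below, to form the deep generative model described in the abstract.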
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070150227
http://hdl.handle.net/11536/76508
Appears in Collections: Thesis