Title: Unpaired Translation between Images and Texts Using Generative Adversarial Networks
Authors: Wong, Ching-Nian; Chuang, Jen-Hui; Lee, Chia-Hoang; Liu, Chien-Liang
Department: Institute of Computer Science and Engineering
Keywords: Neural networks; Deep learning; Generative adversarial networks
Date of Issue: 2017
Abstract: Translation between images and texts can be viewed as a combination of two tasks from computer vision and natural language processing: generating images conditioned on texts, and generating texts conditioned on images. Traditional supervised learning algorithms require not only labels but also the pairing information between samples and labels in order to learn the correspondence between images and their text labels. The data accepted by such algorithms may pair each sample with a single label or, in many cases, with multiple labels, and multi-label classification has long been an active research area. Labeling, however, is time-consuming, and in many settings the labels and the sample-label pairing information may be unavailable. This thesis focuses on learning in the absence of pairing information.
Without pairing information, translation between images and texts can be regarded as learning the latent relationship between two different datasets, where, notably, one dataset is represented with continuous values and the other with discrete values.
We propose a model trained in an adversarial manner to handle this task, and demonstrate that, without using pairing information during training, it can describe the features of an image with text and generate images exhibiting the features that a text describes.
Translation between images and texts can be regarded as a combination of two tasks: generating images conditioned on texts, and generating texts conditioned on images. Traditional supervised learning algorithms need not only the labels but also the pairing information between samples and labels to learn the relations between images and their corresponding text labels. Moreover, while traditional supervised learning algorithms allow a single label for each sample, multi-label outcomes also occur in many application settings, explaining why multi-label classification has caught the attention of researchers for decades. Labeling, however, is a time-consuming and labor-intensive task, and the labels and pairing information may be unavailable in many settings. This thesis focuses on the setting in which pairing information is absent from the data. The task of translation between images and texts without pairing information can be considered a task of learning the implicit relationship between two different datasets, where one lies in a continuous domain and the other in a discrete domain. We propose a model to deal with this task, and demonstrate that the proposed model, trained without pairing information, can describe images with attribute tokens and generate images according to attribute tokens.
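The adversarial training referred to above can be illustrated with the standard GAN objective: a discriminator is trained to distinguish real samples from generated ones, while a generator is trained to fool it. The toy one-dimensional linear generator and logistic discriminator below are illustrative assumptions for a minimal sketch, not the architecture used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Logistic function used by the discriminator to output a probability.
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical toy "networks": generator g(z) = Wg*z + bg maps noise to
# samples; discriminator d(x) = sigmoid(wd*x + bd) scores real vs. fake.
Wg, bg = 0.5, 0.0
wd, bd = 1.0, 0.0

real = rng.normal(loc=2.0, scale=0.5, size=64)  # toy "real" data
z = rng.normal(size=64)                         # noise input
fake = Wg * z + bg                              # generator output

d_real = sigmoid(wd * real + bd)  # discriminator score on real samples
d_fake = sigmoid(wd * fake + bd)  # discriminator score on generated samples

# Standard GAN losses: the discriminator maximizes log d(x) + log(1 - d(g(z))),
# while the generator minimizes -log d(g(z)) (the non-saturating form).
eps = 1e-8  # numerical stability
d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
g_loss = -np.mean(np.log(d_fake + eps))
print(d_loss, g_loss)
```

In a full model, `Wg`, `bg`, `wd`, and `bd` would be the parameters of neural networks updated alternately by gradient descent on `d_loss` and `g_loss`.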
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070456008
http://hdl.handle.net/11536/142357
Appears in Collections: Thesis