Title: Video Style Transformation Based on Semantic Segmentation
Authors: Chang, Rong-Jie
Shih, Zen-Chung
Institute of Multimedia Engineering
Keywords: Style transfer; Semantic segmentation; Motion estimation; Deep learning
Issue Date: 2017
Abstract: Most deep-learning-based applications of artistic style transfer to video extract features from only a single style image for texture synthesis. Even when two or more style images are used, they are usually blended into one result without clear rules. This design keeps the style consistent across the whole video, but it limits the user's creativity and choices. In this thesis, we therefore segment foreground objects from the background and apply a different style to each. We use a fully convolutional network to perform semantic segmentation. Besides increasing the reliability of the segmented objects, we exploit the segmentation information and the relationship between foreground and background to refine the segmentation iteratively. For temporal coherence, we also use the segmentation results to improve optical flow, applying different motion estimation methods to the foreground and the background according to their characteristics. This corrects inaccurate motion boundaries in the optical flow, and the flow in turn reduces the incorrect and discontinuous segmentation caused by occlusion and object deformation.
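The per-region pipeline summarized in the abstract (segment, stylize foreground and background separately, composite, and stabilize the mask with optical flow) can be illustrated with a short, self-contained sketch. This is not the thesis's implementation: it assumes torchvision's pretrained fcn_resnet50 (torchvision >= 0.13) as the segmentation network, treats the Pascal VOC "person" class as the foreground, substitutes OpenCV's Farneback dense flow for the thesis's per-region motion estimation, and leaves the two style transfer networks as hypothetical callables stylize_fg and stylize_bg.

```python
import cv2
import numpy as np
import torch
import torchvision
from torchvision import transforms

# Pretrained FCN as a stand-in for the thesis's segmentation network.
fcn = torchvision.models.segmentation.fcn_resnet50(weights="DEFAULT").eval()
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
PERSON = 15  # Pascal VOC class index for "person"; foreground by assumption

def foreground_mask(frame_rgb: np.ndarray) -> np.ndarray:
    """Binary foreground mask (H x W) from per-pixel class predictions."""
    with torch.no_grad():
        logits = fcn(preprocess(frame_rgb).unsqueeze(0))["out"]  # 1 x C x H x W
    return (logits.argmax(1)[0] == PERSON).numpy()

def warp_mask(prev_mask: np.ndarray, prev_gray: np.ndarray,
              cur_gray: np.ndarray) -> np.ndarray:
    """Warp the previous frame's mask to the current frame with dense flow,
    a simple stand-in for the thesis's flow-guided segmentation refinement."""
    # Flow from current to previous frame, so remap samples prev_mask.
    flow = cv2.calcOpticalFlowFarneback(cur_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = prev_mask.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(prev_mask.astype(np.uint8), map_x, map_y,
                     cv2.INTER_NEAREST).astype(bool)

def composite(frame_rgb, prev_mask, prev_gray, stylize_fg, stylize_bg):
    """Stylize foreground and background separately, then blend by mask.
    stylize_fg / stylize_bg are hypothetical callables returning uint8 RGB
    images of the same size as the input frame."""
    cur_gray = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2GRAY)
    mask = foreground_mask(frame_rgb)
    if prev_mask is not None:
        # Fill mask gaps caused by occlusion or deformation with the
        # flow-warped previous mask (union as a crude temporal prior).
        mask = np.logical_or(mask, warp_mask(prev_mask, prev_gray, cur_gray))
    m = mask[..., None].astype(np.float32)
    out = m * stylize_fg(frame_rgb) + (1.0 - m) * stylize_bg(frame_rgb)
    return out.astype(np.uint8), mask, cur_gray

# Usage sketch over a sequence of RGB uint8 frames:
# prev_mask, prev_gray = None, None
# for frame in video_frames:
#     styled, prev_mask, prev_gray = composite(frame, prev_mask, prev_gray,
#                                              stylize_fg, stylize_bg)
```

The union with the warped mask is only one crude way to exploit temporal information; the thesis refines segmentation iteratively and uses different motion estimation for foreground and background, which this sketch does not reproduce.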
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070456631
http://hdl.handle.net/11536/142599
Appears in Collections: Thesis