Automatic Movie Comicization (電影轉換為漫畫之自動化系統)

Title:	Automatic movie comicization (電影轉換為漫畫之自動化系統)
Authors:	Kao, Wei-Hsiang (高煒翔); Chen, Ling-Hwei (陳玲慧); Institute of Multimedia Engineering
Keywords:	keyframe; speaker identification; cartoonization
Publication date:	2015
Abstract:	With the rapid development of information technology, mobile devices have become increasingly widespread, and people have grown accustomed to enjoying "stories" on them. Such stories are presented mainly as videos, comics, or novels. Given the constraints of these devices and the efficiency of reading on them, comics are a comparatively well-suited form. However, creating a comic is a complicated process that requires a great deal of manual work, so the demand for automatic comic production is gradually increasing. Of the other two forms, video is structurally closer to comics, so this thesis studies how to convert a video into a comic. The main differences between a video and a comic lie in how content and visual effects are presented. In a video, a single sentence may span dozens of frames, whereas a comic panel usually contains only one sentence and a page holds only about five or six panels. How to convey the information in a video with a limited number of panels is therefore the first issue. In addition, sound and subtitles let a video audience easily follow what is being said and who is saying it. In a comic, this information is conveyed by a speech balloon that encloses the dialogue and is placed near the speaker, so matching each line of dialogue to its speaker is the second issue. The last issue is the difference in visual presentation between video and comics, which covers the generation and placement of balloons, the stylization of frames, and the layout of panels. This thesis addresses these three issues: (1) key-frame extraction; (2) matching dialogue lines to faces (speaker identification); (3) cartoonization, that is, converting the visual style. The first two issues are solved by combining video with textual information (the script and subtitles), and the third is handled according to principles of visual style and comic design. Video clips are used as test data to show the effectiveness of the proposed system, which automatically converts a video into a comic.
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070256624 http://hdl.handle.net/11536/143425
Appears in Collections:	Thesis