Mandarin Speech to Singing: The Integration of Speech Segmentation and Pitch Model Based on WORLD

Title:	Mandarin Speech to Singing: The Integration of Speech Segmentation and Pitch Model Based on WORLD (中文語音轉換歌聲：基於WORLD系統之語音端點與音高模型整合)
Authors:	Tong, Yu-Chen (湯宇辰); Huang, Chih-Fang (黃志方); Chieng, Wei-Hua (成維華); Master Program of Sound and Music Innovative Technologies, College of Engineering
Keywords:	Speech Segmentation; WORLD; Speech to Singing
Issue Date:	2016
Abstract:	With today's well-developed Internet media and advances in digital music, many music creators need only a computer and simple hardware to share their work and ideas online. In recent years, the entertainment media have promoted virtual singers as publicity figures; these have become popular among young people and given rise to a distinctive subculture. This thesis focuses on singing synthesis. Most singing synthesis systems first build a corpus of high-quality recorded speech, then select the required segments from the corpus, feed them into the system for adjustment and synthesis, and concatenate the results. This study instead takes speech recorded casually by the general public, that is, relatively low-quality speech, as its source. It uses WORLD, a speech analysis and synthesis system, to extract the relevant speech parameters, adjusts their values, and converts the speech into a voice that sounds like singing. Some voice synthesis systems, such as VOCALOID, rely on manual parameter adjustment to produce singing; the result sounds natural, but the process is time-consuming and requires skill. Other studies adjust the parameters automatically with machine learning; these also perform well but require a much larger amount of training data. This study therefore adjusts the pitch contour with a simple, automatic method, so that singing produced with little effort can still achieve reasonably good quality.
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070351908 http://hdl.handle.net/11536/138920
Appears in Collections:	Thesis