Title: 國語基本音節的頻域轉換
A Study of Spectral Conversion of Mandarin Base-Syllables
Authors: 何宗仁
Tzong-Ren Ho
張文輝
Wen-Whei Chang
Institute of Communications Engineering (電信工程研究所)
Keywords: voice conversion;mapping function;sinusoidal analysis-synthesis;principal component analysis;Mandarin Base-Syllables
Issue Date: 2001
Abstract: The performance of voice conversion depends on whether the mapping function can adequately map the characteristic features of the source speaker onto those of the target speaker. Previous research was based on vector codebook mapping, whose quantization distortion degrades conversion performance. To overcome this limitation, this thesis describes the speech features with continuous probabilistic models and derives two mapping functions: one based on a statistical model and the other based on a Gaussian mixture model. Training a mapping function requires a large amount of speech data, and speech features are highly correlated with one another, so principal component analysis is applied as a preprocessing step to reduce the feature dimension. This reduces the amount of training data required and speeds up the convergence of model training. The thesis further analyzes the articulation characteristics of Mandarin base-syllables, designs a mapping function for each phonetic class, and applies the results to correcting the speech of hearing-impaired speakers. Experimental results indicate that the proposed mapping functions help to enhance hearing-impaired speech, especially for fricatives and affricates, the classes with the most severe articulation defects.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT900435059
http://hdl.handle.net/11536/68936
Appears in Collections: Thesis