Title: | 语音转换及其在异常发声矫正之应用 (Voice Conversion with Application to Enhanced Intelligibility of Hearing Impaired) |
Authors: | 李承龙 Cheng-Long Lee; 张文辉 Wen-Whei Chang; Institute of Communications Engineering |
Keywords: | voice conversion; sinusoidal speech model |
Publication date: | 2000 |
Abstract: | Voice conversion modifies the speech signal of a source speaker so that it sounds as if it had been uttered by a target speaker. The first step is therefore to extract the characteristic features that convey speaker individuality. This is done by decomposing the speech waveform into a sum of sinusoids; the sine-wave amplitudes determine the spectral envelope, which is then characterized by a Gaussian mixture model. The features are converted by a linear mapping function based on that Gaussian mixture model, with parameters trained by a joint estimation algorithm so as to minimize the spectral distortion between source and target speakers uttering the same text. The converted speech is then synthesized using the harmonic synthesis technique of the sinusoidal speech model. Experimental results indicate that, for conversion between two normal-hearing speakers, the Bark spectrum yields better synthesized speech quality than cepstral coefficients: in subjective listening tests, 84% of the utterances converted with the Bark spectrum were judged favorably, 5% more than with cepstral coefficients. The second part of this study applies voice conversion to the design of speaking aids for the hearing-impaired. Experimental results indicate that cepstrum-based voice conversion can effectively improve the intelligibility of hearing-impaired speech. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT890435017 http://hdl.handle.net/11536/67297 |
Appears in Collections: | Thesis |
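
The abstract describes a linear mapping function built on a Gaussian mixture model with jointly estimated parameters. The record does not give the exact formulation, so the following is only a minimal sketch of one common form of such a mapping, assuming a joint-density GMM over paired source/target feature vectors. All names (`convert`, `weights`, `mu_x`, `mu_y`, `cov_xx`, `cov_yx`) are hypothetical, and the parameters are assumed to have been trained already.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at a single vector x."""
    d = x.shape[0]
    diff = x - mean
    log_det = np.linalg.slogdet(cov)[1]
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return np.exp(-0.5 * (d * np.log(2.0 * np.pi) + log_det + mahalanobis))


def convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map one source feature vector x into the target speaker's space.

    Assumed form (not confirmed by the record):
        F(x) = sum_i p(i | x) * (mu_y[i] + cov_yx[i] @ inv(cov_xx[i]) @ (x - mu_x[i]))
    where p(i | x) is the posterior probability of mixture component i.
    """
    # Posterior probability of each mixture component given the source vector.
    likelihoods = np.array(
        [w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, mu_x, cov_xx)]
    )
    posteriors = likelihoods / likelihoods.sum()

    # Posterior-weighted sum of per-component linear transformations.
    y = np.zeros(mu_y[0].shape[0])
    for p, mx, my, cxx, cyx in zip(posteriors, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y


# Toy usage with a single mixture component (illustrative values only).
weights = np.array([1.0])
mu_x = np.array([[0.0, 0.0]])
mu_y = np.array([[1.0, 1.0]])
cov_xx = np.array([np.eye(2)])
cov_yx = np.array([2.0 * np.eye(2)])
converted = convert(np.array([1.0, 2.0]), weights, mu_x, mu_y, cov_xx, cov_yx)
```

With a single component the mapping reduces to one linear regression from source to target features; with several components, each frame is converted by a blend of per-component regressions weighted by how well each component explains the source frame.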