Title: | Incorporating Error Shaping Technique and Spectral Dynamics into LSF Vector Quantization (含誤差成形技術與頻譜動態之線頻譜頻率量化) |
Authors: | 粘溪文 Hsi-Wen Nein; 林進燈 Chin-Teng Lin; Institute of Electrical and Control Engineering (電控工程研究所) |
Keywords: | error shaping technique; line spectral frequency (LSF); spectral dynamics information; perceptual properties of the human ear; weighted log-spectral distortion; weighted mean squared error; Mixed Excitation Linear Prediction (MELP) |
Issue Date: | 2000 |
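Abstract: | The main purpose of this thesis is to improve line spectral frequency (LSF) quantizers by using an error shaping technique and spectral dynamics information. The error shaping technique exploits the perceptual properties of the human ear to reduce the perceived distortion after LSF quantization. The spectral dynamics information improves the smoothness of the quantized LSFs, which reduces discontinuity distortion after quantization and thereby improves quantizer performance.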
The error shaping technique is based on the weighted log-spectral distortion (WLSD) measure, which uses a frequency-domain weighting function to shape the spectral distribution of the LSF quantization error into an arbitrary frequency-dependent curve. Because the computational complexity of the WLSD measure is very high, we approximate it by a computationally simpler quadratically weighted distortion measure and by an even simpler weighted mean squared error (WMSE) measure, so that the WLSD-based error shaping technique remains usable in practical applications. We also analyze the WLSD measure theoretically and derive the optimal LSF weights for the WMSE measure.
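The use of spectral dynamics information in LSF quantization is based on a modified weighted log-spectral distortion (MWLSD) measure. The MWLSD measure retains the ability of the weighted log-spectral distortion (WLSD) measure to shape the spectral error distribution, and at the same time it can be used to reduce the spectral-dynamics distortion between quantized and unquantized spectra that LSF quantization introduces. In other words, by means of a frequency-domain weighting function, the MWLSD measure allows both the spectral distortion and the spectral-dynamics distortion to be taken into account when designing an LSF quantizer. However, the MWLSD measure shares the high computational complexity of the WLSD measure. To keep this complexity from hindering practical use, we derive through theoretical analysis a computationally simpler quadratically weighted distortion (QWD) measure that approximates the MWLSD measure. Because even the QWD measure is still too costly for some practical LSF quantization applications, we further reduce it to a simplified quadratically weighted distortion (SQWD) measure, whose computational cost is comparable to that of the weighted mean squared error (WMSE) measure.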
Finally, we apply the LSF quantizer incorporating the error shaping technique and the spectral dynamics information to a low-bit-rate Mixed Excitation Linear Prediction (MELP) speech coder, to test whether the proposed techniques improve the performance of the LSF quantizer and to observe their effect on the quality of the synthesized speech.
In this thesis, the error shaping technique and the information on spectral dynamics between two successive frames of speech spectra are simultaneously incorporated into LSF vector quantization (VQ) to improve the performance of LSF quantizers. The error shaping technique makes better use of the perceptual properties of the human ear, and the spectral dynamics information incorporated into the LSF VQ smooths the spectral quantization error so as to reduce the perceived distortion. The error shaping technique, based on the weighted log-spectral distortion (WLSD) measure, can shape the spectral distortion distribution of the quantization error into any frequency-dependent curve, depending on the weighting function used. Because the high computational complexity of the WLSD measure deters practical use of this error shaping technique, the WLSD measure is approximated by a quadratic distortion measure or by the weighted mean squared error (WMSE) measure. The optimal WMSE weights (i.e., the optimal weights of the LSF parameters) are also determined from a theoretical analysis of the WLSD measure. To incorporate the spectral dynamics of LPC spectra into LSF VQ, a new technique is proposed, based on a modified weighted log-spectral distortion (MWLSD) measure. The MWLSD measure can be used to shape the spectral quantization distortion distribution into any frequency-dependent shaping curve and simultaneously reduce the spectral-dynamics distortion between quantized and unquantized spectra.
That is, both the spectral distortion and the spectral-dynamics distortion between quantized and unquantized spectra can be taken into account simultaneously when designing a quantizer for a desired error shaping function by using the MWLSD measure. To reduce the high computational complexity of the MWLSD measure during the search procedure in LSF VQ, a quadratically weighted distortion (QWD) measure that approximates the MWLSD measure is derived from a theoretical analysis of the MWLSD measure. A simplified quadratically weighted distortion (SQWD) measure, whose computational complexity is almost equal to that of the weighted mean squared error (WMSE) measure, is also proposed to further reduce the computational complexity of the QWD measure for practical applications. Finally, the error shaping technique and the spectral dynamics information are applied to the LSF quantization of CELP and MELP coders to test how they affect the overall speech quality in actual speech coding algorithms. |
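The distortion measures named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's formulation: the exact definitions of the WLSD, MWLSD, QWD and SQWD measures are not given in this record. The sketch assumes a weighted RMS difference of LPC log spectra in dB for the weighted log-spectral distortion, the same weighted RMS applied to frame-to-frame spectral change for the spectral-dynamics distortion, and a plain weighted squared error on LSF vectors for the codebook search. All function names, the FFT size and the weight normalisation are hypothetical.

```python
import numpy as np

def log_spectrum_db(a, n_fft=512):
    """Log power spectrum in dB of the all-pole filter 1/A(z), a = [1, a1, ..., ap]."""
    A = np.fft.rfft(np.asarray(a, dtype=float), n_fft)
    return -20.0 * np.log10(np.abs(A))

def _weighted_rms(err, weight):
    """RMS of err under a frequency weighting normalised to unit mean (assumption)."""
    w = np.ones_like(err) if weight is None else np.asarray(weight, dtype=float)
    w = w / w.mean()
    return float(np.sqrt(np.mean(w * err ** 2)))

def wlsd(a_ref, a_quant, weight=None, n_fft=512):
    """Weighted log-spectral distortion in dB (assumed form, not the thesis's)."""
    err = log_spectrum_db(a_ref, n_fft) - log_spectrum_db(a_quant, n_fft)
    return _weighted_rms(err, weight)

def dynamics_distortion(a_prev, a_cur, aq_prev, aq_cur, weight=None, n_fft=512):
    """Weighted error between unquantized and quantized frame-to-frame
    spectral change (assumed form of a spectral-dynamics distortion)."""
    dyn = log_spectrum_db(a_cur, n_fft) - log_spectrum_db(a_prev, n_fft)
    dyn_q = log_spectrum_db(aq_cur, n_fft) - log_spectrum_db(aq_prev, n_fft)
    return _weighted_rms(dyn - dyn_q, weight)

def wmse_search(lsf, codebook, weights):
    """Index of the codebook row with the smallest weighted squared error."""
    err = np.sum(weights * (codebook - lsf) ** 2, axis=1)
    return int(np.argmin(err))

if __name__ == "__main__":
    # Toy first-order LPC polynomials standing in for original and quantized frames.
    a_ref, a_q = [1.0, -0.9], [1.0, -0.8]
    low_emphasis = np.linspace(2.0, 0.0, 512 // 2 + 1)  # weight low frequencies more
    print("uniform weighting:", wlsd(a_ref, a_q))
    print("low-frequency emphasis:", wlsd(a_ref, a_q, low_emphasis))

    # Per-dimension weights can change which codevector wins the search.
    lsf = np.array([0.3, 1.0])
    codebook = np.array([[0.3, 1.2], [0.4, 1.0]])
    print(wmse_search(lsf, codebook, np.ones(2)))
    print(wmse_search(lsf, codebook, np.array([10.0, 0.1])))
```

The log-spectral functions need an FFT per candidate, whereas the weighted squared-error search needs only a few multiplications per codevector. That cost gap is the reason the abstract gives for approximating the log-spectral measures with quadratic and weighted squared-error forms.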
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT890591004 http://hdl.handle.net/11536/67770 |
Appears in Collections: | Thesis |