Title: 聽覺模型在強健性正弦編碼處理之應用
Perceptual Enhancement of Sinusoidal Transform Coding for Noisy Channels
Authors: 王德譽
De-Yu Wang
張文輝
Wen-Whei Chang
Institute of Communications Engineering
Keywords: Sinusoidal Transform Coding (STC); Bark spectrum; perceptual linear prediction (PLP); noncausal all-pole vocal system; Robust VQ; index assignment; genetic algorithm (GA); Hidden Markov Model (HMM)
Issue Date: 1998
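Abstract: This thesis uses a sinusoidal excitation model together with psychoacoustic knowledge to develop a 2.4 kb/s speech coding technique that combines high speech quality with a high compression ratio, and it designs a robust vector quantizer for the Bark spectrum parameters to combat channel noise. For speech compression, we take into account the nonlinear loudness response of the human ear at different frequencies and loudness levels, and encode the sinusoidal harmonic amplitudes using the Bark spectrum. For decoding, because the number of harmonics varies with pitch, there is no one-to-one mapping between the harmonic amplitudes and the Bark spectrum. To solve this problem, we build the spectral envelope with a perceptual linear prediction model, obtain its model parameters from the Bark spectrum, and then recover the harmonic amplitudes. To estimate the phase more accurately, the thesis also proposes a noncausal all-pole phase model, which experiments show models the vocal tract phase more effectively. Regarding robust vector quantization of the Bark spectrum, codebook index assignment can effectively reduce the effect of channel noise, but previous studies have been limited to memoryless binary symmetric channels. This thesis first derives the codebook index assignment problem mathematically for a channel with memory modeled as a finite-state Markov random process, and then, drawing on genetic algorithms, builds a stochastic search procedure suited to function optimization, which is used to design the channel error model and the vector quantization codebook index assignment under noise. Experimental results show that the proposed sinusoidal parameter model improves synthesized speech quality and that its Bark spectrum codebook is effectively robust to channel noise.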
This study addresses two issues, perceptual enhancement of sinusoidal transform coding (STC) and optimal index assignment for vector quantization (VQ), in order to design a 2.4 kb/s speech coder that is highly robust against channel errors. STC models the speech waveform as a sum of sinusoids whose frequencies, amplitudes, and phases are chosen to make the reconstruction a best fit to the original speech. The first part of the study concerns quality enhancement of STC through the development of new parametric models. The benefits of the Bark spectrum are explored for perceptual coding of the sine-wave amplitudes. Compared with existing STC based on a cepstral representation, the Bark-based amplitude coder is preferred because it achieves a uniform perceptual fit across the spectrum. Phase accuracy is further improved by a noncausal all-pole vocal system that better matches the maximum-phase nature of differentiated glottal pulses. The second part of the study concerns transmission of the vector-quantized Bark spectrum over a noisy channel. We formulate channel-robust VQ design as a combinatorial optimization problem: a search for the index assignment that minimizes distortion. To better track the statistical dependencies between error sequences, we incorporate a Markov characterization of the channel into the VQ design. Simulation results indicate that the global explorative properties of genetic algorithms make them very effective at finding the optimal index assignment, and that with this index assignment the vector quantizer can be designed to respond to various channel conditions.
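The index assignment problem described in the abstract can be illustrated with a minimal Python sketch. It makes several simplifying assumptions that are not taken from the thesis: the channel is a memoryless binary symmetric channel (not the Markov channel with memory that the thesis proposes), codevectors are equally likely, distortion is squared error, and the search is exhaustive over all permutations in place of a genetic algorithm. The function names `channel_distortion` and `best_assignment` and the toy codebook are hypothetical.

```python
from itertools import permutations


def hamming(a, b):
    """Number of bit positions in which the integers a and b differ."""
    return bin(a ^ b).count("1")


def channel_distortion(codebook, assignment, bits, ber):
    """Expected squared error caused by bit errors on a memoryless BSC.

    assignment[i] is the binary index transmitted for codevector i.
    Assumes equally likely codevectors (an illustrative assumption).
    """
    n = len(codebook)
    total = 0.0
    for i in range(n):
        for j in range(n):
            d = hamming(assignment[i], assignment[j])
            # Probability that index i is received as index j.
            p = (ber ** d) * ((1.0 - ber) ** (bits - d))
            err = sum((x - y) ** 2 for x, y in zip(codebook[i], codebook[j]))
            total += p * err
    return total / n


def best_assignment(codebook, bits, ber):
    """Exhaustive search; only feasible for very small codebooks."""
    n = len(codebook)
    return min(
        permutations(range(n)),
        key=lambda a: channel_distortion(codebook, a, bits, ber),
    )


# Toy 2-bit codebook of one-dimensional codevectors (hypothetical data).
codebook = [(0.0,), (1.0,), (2.0,), (3.0,)]
natural = (0, 1, 2, 3)
scrambled = (0, 3, 1, 2)
best = best_assignment(codebook, bits=2, ber=0.01)
```

Exhaustive search grows factorially with codebook size, which is why a stochastic search such as the genetic algorithm named in the abstract is needed for codebooks of practical size.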
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT870435096
http://hdl.handle.net/11536/64554
Appears in Collections: Thesis