A 2.4 Kbps Bit Rate Speech Coding Technique

Full metadata record

DC Field	Value	Language
dc.contributor.author	林信安	en_US
dc.contributor.author	Lin, Hsin-An	en_US
dc.contributor.author	林進燈	en_US
dc.contributor.author	Lin Chin-Teng	en_US
dc.date.accessioned	2014-12-12T02:15:03Z	-
dc.date.available	2014-12-12T02:15:03Z	-
dc.date.issued	1995	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT840327068	en_US
dc.identifier.uri	http://hdl.handle.net/11536/60328	-
dc.description.abstract	The well-known F.S. 1016 CELP technique at 4.8 kbps achieves a low bit rate while maintaining high-quality synthetic speech. However, because communication channel capacity and storage are limited, it has become important to represent speech signals at lower bit rates (below 4.8 kbps). Traditional linear predictive coding (LPC) vocoders can produce intelligible speech at 2.4 kbps, but they often generate unnatural sounds such as buzzes, thumps, and tonal noises. These problems arise from the use of a periodic pulse train, a voicing decision of only one bit per frame, and inaccurate gain estimation. This dissertation presents an improved LPC vocoder based on the traditional LPC vocoder structure. To reproduce more natural synthetic speech, the coder uses an aperiodic pulse scheme, a quarter voiced/unvoiced decision, and gain estimation based on envelope shape. The aperiodic pulse reduces the sharp spectral peaks in the LPC spectrum that can cause unnatural sounds. The quarter voiced/unvoiced decision scheme divides each frame of the speech signal into four subframes and classifies each subframe as either voiced or unvoiced. Gain estimation is performed with a closed-loop analysis-by-synthesis technique that matches the envelope shape of the synthetic speech to that of the original speech, giving a cleaner and smoother output. Although the performance of the improved LPC vocoder at 2.4 kbps is acceptable, its output is still perceived as rough or noisy. Hence, an adaptive postfilter is used to improve the perceptual quality of the synthetic speech. Moreover, lattice vector quantization (LVQ) is used to quantize the Line Spectrum Pair (LSP) parameters with as few bits as possible without reducing speech quality. The LVQ requires little memory and has low computational complexity. Mean opinion score (MOS) results show that the improved LPC vocoder achieves quality superior to that of existing LPC-10 versions.	en_US
dc.language.iso	zh_TW	en_US
dc.subject	改良型線性預估編碼 Vocoder	zh_TW
dc.subject	非週期性脈衝	zh_TW
dc.subject	四分之一音框之有聲/無聲的決策	zh_TW
dc.subject	包封型狀	zh_TW
dc.subject	格子狀向量量化	zh_TW
dc.subject	Improved Linear Predictive Coding Vocoder	en_US
dc.subject	Aperiodic Pulse	en_US
dc.subject	Quarter Voiced/Unvoiced Decision	en_US
dc.subject	Envelope Shape	en_US
dc.subject	Lattice Vector Quantization	en_US
dc.title	2.4 Kbps 位元率語音編碼技術	zh_TW
dc.title	A 2.4 Kbps Bit Rate Speech Coding Technique	en_US
dc.type	Thesis	en_US
dc.contributor.department	電控工程研究所	zh_TW
Appears in Collections:	Thesis
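
The quarter voiced/unvoiced decision described in the abstract can be illustrated with a minimal Python sketch. The record does not state how the thesis classifies each subframe, so the criterion used here (mean energy combined with zero-crossing rate), the function name quarter_voicing, both thresholds, and the 180-sample frame (22.5 ms at 8 kHz, the frame size used by LPC-10) are all assumptions made for illustration only.

```python
import math


def quarter_voicing(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Split one frame into four subframes and label each one.

    Returns a list of four booleans: True = voiced, False = unvoiced.
    Assumed criterion (not taken from the thesis): a subframe is voiced
    when its mean energy is high and its zero-crossing rate is low.
    """
    if len(frame) < 8 or len(frame) % 4 != 0:
        raise ValueError("frame length must be a multiple of 4 and at least 8")
    n = len(frame) // 4
    decisions = []
    for k in range(4):
        sub = frame[k * n:(k + 1) * n]
        # Mean energy of the subframe.
        energy = sum(x * x for x in sub) / n
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(1 for a, b in zip(sub, sub[1:]) if (a >= 0) != (b >= 0))
        zcr = crossings / (n - 1)
        decisions.append(energy > energy_threshold and zcr < zcr_threshold)
    return decisions


# Hypothetical test frame: 90 samples of a 100 Hz tone sampled at 8 kHz
# (voiced-like), followed by 90 samples that flip sign on every sample
# (a crude stand-in for noise-like, unvoiced speech).
frame = [0.5 * math.sin(2 * math.pi * 100 * i / 8000) for i in range(90)]
frame += [0.3 * (-1) ** i for i in range(90)]

print(quarter_voicing(frame))  # [True, True, False, False]
```

With four decisions per frame, a voicing transition that falls inside a frame can be represented at subframe resolution, which a single one-bit decision per frame cannot do.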