Title: | Design and Implementation of Neuromorphic Pitch Based Noise Reduction for Mandarin Digital Hearing Aid System (適用於華語數位助聽器之仿神經音高式噪音消除設計與實現) |
Authors: | Chen, Yu-Jui (陳育瑞); Jou, Shyh-Jye (周世傑); Institute of Electronics |
Keywords: | Noise Reduction; Pitch; Neuromorphic; Hearing Aid |
Issue Date: | 2011 |
Abstract: | In a hearing aid (HA) system, the audio signal is usually amplified to compensate for the patient's hearing loss. However, amplification alone cannot fully solve the problem: excessively amplified sound may damage the patient's residual hearing, and intelligibility may not improve effectively (especially in low-SNR environments) because the background noise is amplified by the same amount. A noise reduction (NR) system is therefore necessary in HAs to reduce the background noise while keeping speech distortion as small as possible. In addition, an HA is a portable, size-limited device, so an NR system implemented in hardware must be low in complexity and low in power to extend battery life and reduce chip area.
In this thesis, we propose a neuromorphic pitch-based noise reduction system for an ANSI S1.11 filter-bank-based Mandarin hearing aid system. The proposed system comprises a pitch-based voice activity detector (pitch-based VAD) and a neuromorphic noise attenuator (NNA). The pitch-based VAD detects speech intervals accurately whether the background noise is stationary (simulating a user who is still) or highly dynamic (simulating a user who is moving). A phoneme keeper (PK), designed according to the characteristics of Mandarin, improves the accuracy and reduces the power consumption of the VAD. The NNA improves the segmental signal-to-noise ratio (SNRseg) and the sound quality (PESQ) of speech by attenuating noise-dominated subbands and preserving speech-dominated subbands. The gain applied to each subband is determined by the signal-to-noise-ratio nonlinear energy operation (SNRNEO, which detects subbands with high energy) in the VAD and by the consonant onset detector. We also optimize the algorithm to reduce its computational complexity for low-power hardware implementation.
Simulation results show that the proposed system improves the average SNRseg by 4.238 dB in stationary background noise with low SNRseg (0 to 4 dB) and by 4.943 dB in dynamic noise. The average PESQ improvement is 0.210 in stationary background noise with high SNRseg (5 to 10 dB) and 0.216 in dynamic noise.
For higher design flexibility, the proposed design is implemented in a processing-element (PE) based hearing aid system. The PE of the proposed NR system includes a 5-stage pipeline processor and an accelerator for computationally intensive operations. The implementation is fabricated in TSMC 65 nm GP CMOS technology with a regular-VT cell library. The total gate count of the proposed NR system is about 19,663 excluding memory and about 120,829 including the registers used in place of memory; these registers total 12 Kbits. The clock rate of the hearing aid system is 10 MHz. The estimated total power of the proposed NR implementation at a 2.5 MHz clock rate is 563.85 μW at VDD = 1 V and about 54.88 μW at VDD = 0.5 V. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079811624 http://hdl.handle.net/11536/46789 |
Appears in Collections: | Thesis |
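
Note on the SNRseg metric: the abstract reports improvements in segmental SNR but does not define it. The sketch below shows one common textbook formulation, in which the signal is split into short frames, an SNR in dB is computed per frame against a clean reference, each value is clamped to a fixed range, and the results are averaged. It is an illustration only and may differ from the thesis's exact procedure. The function name `segmental_snr_db`, the frame length of 160 samples, and the clamp limits of -10 dB and 35 dB are assumptions that do not come from the record.

```python
import math


def segmental_snr_db(clean, processed, frame_len=160,
                     floor_db=-10.0, ceil_db=35.0):
    """Average per-frame SNR (dB) of `processed` against the `clean` reference.

    frame_len, floor_db and ceil_db are illustrative defaults (assumptions),
    not values taken from the thesis.
    """
    if len(clean) != len(processed):
        raise ValueError("signals must have the same length")

    frame_snrs = []
    # Walk over non-overlapping frames; a trailing partial frame is ignored.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = processed[start:start + frame_len]
        signal_energy = sum(s * s for s in ref)
        noise_energy = sum((s - y) ** 2 for s, y in zip(ref, out))
        if signal_energy == 0.0:
            continue  # skip frames where the reference is silent
        if noise_energy == 0.0:
            snr = ceil_db  # perfect reconstruction: use the upper clamp
        else:
            snr = 10.0 * math.log10(signal_energy / noise_energy)
        # Clamp so that a few extreme frames do not dominate the average.
        frame_snrs.append(min(max(snr, floor_db), ceil_db))

    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)
```

Under this formulation, an improvement figure like those quoted in the abstract would be the difference `segmental_snr_db(clean, enhanced) - segmental_snr_db(clean, noisy)`, where `enhanced` is the output of the noise reduction stage and `noisy` is its input.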