Title: FFT Based Noise Reduction and Feedback Cancellation with Pitch Based Voice Activity Detector for Digital Hearing Aid System
Author: Lin, Yu-Wen (林禹文)
Jou, Shyh-Jye (周世傑)
Department of Electronics Engineering, Institute of Electronics
Keywords: hearing aids; noise reduction; feedback cancellation; voice activity detection; pitch; fast Fourier transform
Issue Date: 2014
Abstract: Advances in process technology and signal processing have made digital hearing aids the mainstream type of hearing aid. However, limited battery size and capacity make it difficult to implement complicated algorithms. The most effective remedy so far is efficient low-power design at the algorithm, architecture, and circuit levels. Current hearing aid systems face two main problems. First, speech intelligibility degrades in the presence of background noise. Second, acoustic feedback (echo) from the speaker disturbs the output. To address both problems, we propose FFT based noise reduction (NR) and feedback cancellation algorithms.
The proposed pitch based noise reduction consists of a pitch based voice activity detector (VAD) and a noise suppressor. The pitch based VAD detects speech activity from the pitch, its harmonics, and the onset characteristics of speech. To improve VAD accuracy, two methods are combined for pitch and harmonic detection. To improve NR performance, we adapt the original NR algorithm designed for the Quasi-ANSI filter bank [1] to FFT based decomposition and add mechanisms such as a two-curve, eight-level gain assignment, which improves PESQ. The proposed pitch based VAD achieves an average accuracy of 79.99% and 80.31% in stationary and non-stationary noise environments, respectively. The proposed NR improves the segmental signal-to-noise ratio (SNRseg) and the signal-to-noise ratio (SNR) by an average of 6.09 dB and 8.86 dB in stationary noise environments, and by 6.49 dB and 9.28 dB in non-stationary noise environments. Moreover, sound quality (PESQ) improves by an average of 0.31 and 0.46 in stationary and non-stationary noise environments, respectively.
The feedback cancellation design is also based on FFT decomposition, and a decorrelation filter coefficient update mechanism is proposed. This mechanism uses pitch information to estimate the speech formants, that is, the distribution of speech energy, and uses the estimate to assist the coefficient update of the adaptive feedback cancellation (AFC) algorithm, maintaining stable hearing aid gain and sound quality. The proposed AFC design achieves added stable gain (ASG) and PESQ similar to those of the prediction error method AFC (PEM-AFC) design, with complexity lower by four orders of magnitude.
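The SNR and SNRseg improvements quoted in the abstract can be read against the usual textbook definitions of those two metrics. The sketch below is only an illustration under stated assumptions, not the evaluation code used in the thesis: the function names, the 256-sample frame length, and the [-10, 35] dB clamp on per-frame values are all choices made for this example.

```python
import math


def snr_db(clean, processed):
    """Global SNR in dB: clean-signal energy over error energy."""
    signal = sum(c * c for c in clean)
    noise = sum((c - p) ** 2 for c, p in zip(clean, processed))
    if noise == 0.0:
        return math.inf  # processed signal equals the clean signal
    return 10.0 * math.log10(signal / noise)


def snr_seg_db(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Segmental SNR in dB: mean of per-frame SNRs, each clamped to [lo, hi].

    frame_len, lo and hi are assumed values for illustration only.
    """
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        signal = sum(x * x for x in c)
        noise = sum((x - y) ** 2 for x, y in zip(c, p))
        if signal == 0.0:
            continue  # skip frames with no clean-signal energy
        if noise == 0.0:
            values.append(hi)  # error-free frame: use the upper clamp
            continue
        frame_snr = 10.0 * math.log10(signal / noise)
        values.append(min(hi, max(lo, frame_snr)))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)


def improvement_db(metric, clean, noisy, enhanced):
    """Improvement of a metric: value after processing minus value before."""
    return metric(clean, enhanced) - metric(clean, noisy)
```

With these definitions, an "average improvement" is the difference between the metric measured on the processed output and the metric measured on the noisy input, averaged over test conditions. SNRseg weights every frame equally, so it is less dominated by loud segments than the global SNR.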
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070150199
http://hdl.handle.net/11536/75820
Appears in Collections: Thesis