Title: | A recurrent neural fuzzy network for word boundary detection in variable noise-level environments |
Authors: | Wu, GD; Lin, CT; Institute of Electrical and Control Engineering |
Keywords: | cepstrum;linear prediction coefficient (LPC);mel-scale filter bank;recurrent network;space partition;time-frequency (TF) |
Publication date: | 1-Feb-2001 |
Abstract: | This paper discusses the problem of automatic word boundary detection in the presence of variable-level background noise. Commonly used robust word boundary detection algorithms always assume that the background noise level is fixed. In fact, the background noise level may vary during recording. This is the major reason that most robust word boundary detection algorithms do not work well when the background noise level varies. To solve this problem, we first propose a refined time-frequency (RTF) parameter for extracting both the time and frequency features of noisy speech signals. The RTF parameter extends the time-frequency (TF) parameter proposed by Junqua et al. from single-band to multiband spectrum analysis, where the frequency bands help to make the distinction between speech signal and noise clear. The RTF parameter can extract useful frequency information. Based on this RTF parameter, we further propose a new word boundary detection algorithm that uses a recurrent self-organizing neural fuzzy inference network (RSONFIN). Since RSONFIN can process temporal relations, the proposed RTF-based RSONFIN algorithm can track the variation of the background noise level and detect correct word boundaries when the background noise level varies. Compared to normal neural networks, the RSONFIN can always find for itself an economical network size with high learning speed. Owing to the self-learning ability of RSONFIN, the RTF-based RSONFIN algorithm avoids the need to empirically determine ambiguous decision rules, as required in normal word boundary detection algorithms. Experimental results show that this new algorithm achieves a higher recognition rate than the TF-based algorithm, which has been shown to outperform several commonly used word boundary detection algorithms, by about 12% under variable background noise levels. It also reduces the recognition error rate due to endpoint detection to about 23%, compared to an average of 47% obtained by the TF-based algorithm under the same conditions. |
URI: | http://dx.doi.org/10.1109/3477.907566 http://hdl.handle.net/11536/29875 |
ISSN: | 1083-4419 |
DOI: | 10.1109/3477.907566 |
Journal: | IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS |
Volume: | 31 |
Issue: | 1 |
Start page: | 84 |
End page: | 97 |
Appears in collections: | Journal articles |