Word boundary detection with mel-scale frequency bank in noisy environment

doi:10.1109/89.861373

Full metadata record

DC Field	Value	Language
dc.contributor.author	Wu, GD	en_US
dc.contributor.author	Lin, CT	en_US
dc.date.accessioned	2014-12-08T15:44:50Z	-
dc.date.available	2014-12-08T15:44:50Z	-
dc.date.issued	2000-09-01	en_US
dc.identifier.issn	1063-6676	en_US
dc.identifier.uri	http://dx.doi.org/10.1109/89.861373	en_US
dc.identifier.uri	http://hdl.handle.net/11536/30261	-
dc.description.abstract	This paper addresses the problem of automatic word boundary detection in the presence of noise. We first propose an adaptive time-frequency (ATF) parameter for extracting both the time and frequency features of noisy speech signals. The ATF parameter extends the TF parameter proposed by Junqua et al. from single-band to multiband spectrum analysis, where the frequency bands help to make the distinction between speech and noise signals clear. The ATF parameter can extract useful frequency information by adaptively choosing proper bands of the mel-scale frequency bank. The ATF parameter increased the recognition rate by about 3% over a TF-based robust algorithm, which has been shown to outperform several commonly used algorithms for word boundary detection in the presence of noise. The ATF parameter also reduced the recognition error rate due to endpoint detection to about 20%. Based on the ATF parameter, we further propose a new word boundary detection algorithm that uses a neural fuzzy network (called SONFIN) to identify islands of word signals in noisy environments. Owing to the self-learning ability of SONFIN, the proposed algorithm avoids the need for the empirically determined thresholds and ambiguous rules of normal word boundary detection algorithms. Compared to normal neural networks, the SONFIN can always find an economical network size for itself at high learning speed. Our results also showed that the SONFIN's performance is not significantly affected by the size of the training set. The ATF-based SONFIN achieved a higher recognition rate than the TF-based robust algorithm by about 5%. It also reduced the recognition error rate due to endpoint detection to about 10%, compared to an average of approximately 30% obtained with the TF-based robust algorithm, and 50% obtained with the modified version of the Lamel et al. algorithm.	en_US
dc.language.iso	en_US	en_US
dc.subject	mel-scale frequency	en_US
dc.subject	multiband	en_US
dc.subject	neural fuzzy network	en_US
dc.subject	self-learning ability	en_US
dc.subject	spectrum analysis	en_US
dc.title	Word boundary detection with mel-scale frequency bank in noisy environment	en_US
dc.type	Article	en_US
dc.identifier.doi	10.1109/89.861373	en_US
dc.identifier.journal	IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING	en_US
dc.citation.volume	8	en_US
dc.citation.issue	5	en_US
dc.citation.spage	541	en_US
dc.citation.epage	554	en_US
dc.contributor.department	電控工程研究所	zh_TW
dc.contributor.department	Institute of Electrical and Control Engineering	en_US
dc.identifier.wosnumber	WOS:000088724800005	-
dc.citation.woscount	34	-
Appears in Collections:	Journal Articles

Files in This Item:

000088724800005.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.