Title: A GENERATIVE AUDITORY MODEL EMBEDDED NEURAL NETWORK FOR SPEECH PROCESSING
Authors: Lo, Yu-Wen
Shen, Yih-Liang
Liao, Yuan-Fu
Chi, Tai-Shih
Department of Electrical and Computer Engineering
Keywords: generative auditory model; convolutional neural network; multi-resolution; speaker identification
Publication date: 1-Jan-2018
Abstract: Before the era of neural networks (NNs), features extracted from auditory models were applied to various speech applications and shown to be more robust against noise than conventional speech-processing features. What is the role of auditory models in the current NN era? Are they obsolete? To answer these questions, we construct an NN with an embedded generative auditory model to process speech signals. The generative auditory model consists of two stages: spectrum estimation along the logarithmic-frequency axis by the cochlea, and spectral-temporal analysis in the modulation domain by the auditory cortex. The NN is evaluated on a simple speaker identification task. Experimental results show that the auditory-model-embedded NN remains more robust against noise than a randomly initialized NN in speaker identification, especially in low-SNR conditions.
URI: http://hdl.handle.net/11536/150765
Journal: 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
Start page: 5179
End page: 5183
Appears in collections: Conference Papers
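
Illustrative sketch (not from the paper): the two-stage auditory model described in the abstract can be roughed out as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the frame and band parameters, the Gabor-like kernel shape, and the rate/scale values are all hypothetical choices made for illustration. Stage 1 pools a short-time magnitude spectrum into logarithmically spaced bands. Stage 2 filters the resulting spectrogram with a small bank of spectro-temporal modulation kernels. Because the abstract contrasts the embedded model with a randomly initialized NN, kernels of this kind could plausibly serve as fixed or initial weights of a convolutional layer, but that reading is an assumption.

```python
import numpy as np

def log_frequency_spectrogram(signal, sample_rate, frame_len=400, hop=160,
                              n_fft=1024, n_bands=32, f_min=100.0):
    """Stage 1 (cochlea-like, hypothetical): log-spaced band energies per frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # Band edges spaced evenly on a logarithmic frequency axis.
    edges = np.geomspace(f_min, sample_rate / 2, n_bands + 1)
    bands = np.zeros((n_frames, n_bands))
    for b in range(n_bands):
        mask = (freqs >= edges[b]) & (freqs < edges[b + 1])
        if mask.any():  # a band with no FFT bins stays at zero
            bands[:, b] = magnitude[:, mask].mean(axis=1)
    return np.log1p(bands)  # compressive nonlinearity (assumed)

def gabor_kernel(rate, scale, size=9):
    """Zero-mean 2-D Gabor-like kernel; rate in cycles/frame, scale in cycles/band."""
    axis = np.arange(size) - size // 2
    tt, ff = np.meshgrid(axis, axis, indexing="ij")
    envelope = np.exp(-(tt ** 2 + ff ** 2) / (2 * (size / 4) ** 2))
    carrier = np.cos(2 * np.pi * (rate * tt + scale * ff))
    kernel = envelope * carrier
    return kernel - kernel.mean()

def modulation_features(spec, rates=(0.05, 0.1), scales=(0.1, 0.2)):
    """Stage 2 (cortex-like, hypothetical): spectro-temporal modulation filtering."""
    spec_fft = np.fft.rfft2(spec)
    outputs = []
    for rate in rates:
        for scale in scales:
            # Circular 2-D convolution via the FFT; kernel is zero-padded to spec.shape.
            kernel_fft = np.fft.rfft2(gabor_kernel(rate, scale), s=spec.shape)
            outputs.append(np.fft.irfft2(spec_fft * kernel_fft, s=spec.shape))
    return np.stack(outputs)  # shape: (n_filters, n_frames, n_bands)

# Usage on a synthetic one-second 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
time = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * time)
spec = log_frequency_spectrogram(tone, sample_rate)
feats = modulation_features(spec)
```

Each slice of `feats` is the spectrogram seen through one rate/scale pair, which is one way to read the "multi-resolution" keyword. The circular boundary handling is chosen only to keep the sketch short.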