Full metadata record
DC Field | Value | Language
dc.contributor.author | Chien, Jen-Tzung | en_US
dc.contributor.author | Shen, Chen | en_US
dc.date.accessioned | 2019-04-02T06:04:20Z | -
dc.date.available | 2019-04-02T06:04:20Z | -
dc.date.issued | 2017-01-01 | en_US
dc.identifier.issn | 2308-457X | en_US
dc.identifier.uri | http://dx.doi.org/10.21437/Interspeech.2017-892 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/150997 | -
dc.description.abstract | A conventional speech recognition system is constructed by unfolding the spectral-temporal input matrices into one-way vectors and using these vectors to estimate the affine parameters of a neural network with the vector-based error backpropagation algorithm. System performance is constrained because the contextual correlations along the frequency and time axes are disregarded and the spectral and temporal factors are excluded. This paper proposes a spectral-temporal factorized neural network (STFNN) to tackle this weakness. The spectral-temporal structure is preserved and factorized in the hidden layers through spectral and temporal factor matrices, which are trained with a factorized error backpropagation. The affine transformation of a standard neural network is generalized to a spectro-temporal factorization in the STFNN. The structural features or patterns are extracted and forwarded towards the softmax outputs. A deep neural factorization is built by cascading a number of factorization layers with fully-connected layers for speech recognition. An orthogonal constraint is imposed on the factor matrices for redundancy reduction. Experimental results show the merit of integrating the factorized features into deep feedforward and recurrent neural networks for speech recognition. | en_US
dc.language.iso | en_US | en_US
dc.subject | Spectro-temporal factorization | en_US
dc.subject | deep neural network | en_US
dc.subject | factorized error backpropagation | en_US
dc.subject | speech recognition | en_US
dc.title | Deep Neural Factorization for Speech Recognition | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.doi | 10.21437/Interspeech.2017-892 | en_US
dc.identifier.journal | 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | en_US
dc.citation.spage | 3682 | en_US
dc.citation.epage | 3686 | en_US
dc.contributor.department | 電機工程學系 | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000457505000765 | en_US
dc.citation.woscount | 1 | en_US
Appears in Collections: Conference Papers
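The abstract above outlines the core operation of the proposed STFNN: each hidden layer replaces the usual affine transform on a flattened input vector with a factorization of the spectro-temporal input through spectral and temporal factor matrices, regularized by an orthogonal constraint. The sketch below gives one plausible reading of such a layer in NumPy; the bilinear form U^T X V + B, the tanh nonlinearity, the bias shape, and the soft penalty standing in for the orthogonal constraint are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def stf_layer(X, U, V, B):
    """One spectro-temporal factorization layer (sketch).

    X : (F, T) spectro-temporal input patch (frequency x time)
    U : (F, p) spectral factor matrix
    V : (T, q) temporal factor matrix
    B : (p, q) bias
    Returns a (p, q) hidden feature map H = tanh(U^T X V + B).
    The tanh nonlinearity and the bias shape are assumptions.
    """
    return np.tanh(U.T @ X @ V + B)

def orthogonality_penalty(U, V):
    """Soft orthogonality regularizer on the two factor matrices;
    one plausible realization of the orthogonal constraint."""
    pu = np.linalg.norm(U.T @ U - np.eye(U.shape[1])) ** 2
    pv = np.linalg.norm(V.T @ V - np.eye(V.shape[1])) ** 2
    return pu + pv

# Toy usage: a 40-band, 11-frame log-Mel patch factorized into a 16 x 8
# feature map, which would then be flattened and fed to fully-connected
# layers and the softmax output (not shown).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 11))
U = rng.standard_normal((40, 16)) * 0.1
V = rng.standard_normal((11, 8)) * 0.1
B = np.zeros((16, 8))
H = stf_layer(X, U, V, B)
print(H.shape)                      # (16, 8)
print(orthogonality_penalty(U, V))  # scalar regularization term
```

In a full model, several such layers would be cascaded with fully-connected layers feeding the softmax outputs, as described in the abstract; the penalty would be added to the training loss so that the factor matrices stay close to orthogonal.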