Full metadata record
DC Field | Value | Language
dc.contributor.author | Chien, Jen-Tzung | en_US
dc.contributor.author | Shen, Chen | en_US
dc.date.accessioned | 2019-04-02T06:04:20Z | -
dc.date.available | 2019-04-02T06:04:20Z | -
dc.date.issued | 2017-01-01 | en_US
dc.identifier.issn | 2308-457X | en_US
dc.identifier.uri | http://dx.doi.org/10.21437/Interspeech.2017-892 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/150997 | -
dc.description.abstract | A conventional speech recognition system is constructed by unfolding the spectral-temporal input matrices into one-way vectors and using these vectors to estimate the affine parameters of a neural network with the vector-based error backpropagation algorithm. System performance is constrained because the contextual correlations in the frequency and time horizons are disregarded and the spectral and temporal factors are excluded. This paper proposes a spectral-temporal factorized neural network (STFNN) to tackle this weakness. The spectral-temporal structure is preserved and factorized in the hidden layers through two-way factor matrices, which are trained using factorized error backpropagation. The affine transformation of a standard neural network is generalized to a spectro-temporal factorization in the STFNN. The structural features or patterns are extracted and forwarded to the softmax outputs. A deep neural factorization is built by cascading a number of factorization layers with fully-connected layers for speech recognition. An orthogonal constraint is imposed on the factor matrices for redundancy reduction. Experimental results show the merit of integrating the factorized features in deep feedforward and recurrent neural networks for speech recognition. | en_US
dc.language.iso | en_US | en_US
dc.subject | Spectro-temporal factorization | en_US
dc.subject | deep neural network | en_US
dc.subject | factorized error backpropagation | en_US
dc.subject | speech recognition | en_US
dc.title | Deep Neural Factorization for Speech Recognition | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.doi | 10.21437/Interspeech.2017-892 | en_US
dc.identifier.journal | 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION | en_US
dc.citation.spage | 3682 | en_US
dc.citation.epage | 3686 | en_US
dc.contributor.department | 電機工程學系 | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000457505000765 | en_US
dc.citation.woscount | 1 | en_US
Appears in Collections: Conferences Paper