Hybrid Accelerated Optimization for Speech Recognition

doi:10.21437/Interspeech.2016-192

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chien, Jen-Tzung	en_US
dc.contributor.author	Huang, Pei-Wen	en_US
dc.contributor.author	Lee, Tan	en_US
dc.date.accessioned	2018-08-21T05:56:52Z	-
dc.date.available	2018-08-21T05:56:52Z	-
dc.date.issued	2016-01-01	en_US
dc.identifier.issn	2308-457X	en_US
dc.identifier.uri	http://dx.doi.org/10.21437/Interspeech.2016-192	en_US
dc.identifier.uri	http://hdl.handle.net/11536/146774	-
dc.description.abstract	The optimization procedure is crucial to achieving good performance in speech recognition based on deep neural networks (DNNs). Conventionally, DNNs are trained with mini-batch stochastic gradient descent (SGD), which is stable but prone to being trapped in a local optimum. A recent approach based on Nesterov's accelerated gradient descent (NAG) merges the current momentum information into a correction of the SGD update. NAG is less likely to fall into a local minimum, so the convergence rate is improved. In general, optimization based on SGD is more stable, while that based on NAG is faster and more accurate. This study aims to boost speech recognition performance by combining the complementary SGD and NAG. A new hybrid optimization is proposed that integrates SGD with momentum and NAG through an interpolation scheme, which is run continuously in each mini-batch according to the rate of change of the cost function over two consecutive learning epochs. The tradeoff between the two algorithms can thus be balanced in mini-batch optimization. Experiments on speech recognition using CUSENT and Aurora-4 show the effectiveness of the hybrid accelerated optimization for DNN acoustic modeling.	en_US
dc.language.iso	en_US	en_US
dc.subject	hybrid optimization	en_US
dc.subject	stochastic gradient descent	en_US
dc.subject	deep neural network	en_US
dc.subject	speech recognition	en_US
dc.title	Hybrid Accelerated Optimization for Speech Recognition	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.doi	10.21437/Interspeech.2016-192	en_US
dc.identifier.journal	17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES	en_US
dc.citation.spage	3399	en_US
dc.citation.epage	3403	en_US
dc.contributor.department	電機工程學系	zh_TW
dc.contributor.department	Department of Electrical and Computer Engineering	en_US
dc.identifier.wosnumber	WOS:000409394402061	en_US
Appears in Collections:	Conferences Paper
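
Illustrative sketch (not taken from the paper): the abstract describes interpolating between SGD with momentum and NAG, with the balance driven by the rate of change of the cost function over two consecutive epochs. The record does not give the actual interpolation formula, so the Python sketch below rests on assumptions. The names interpolation_weight, hybrid_step and train are hypothetical, the mapping from cost change to the weight lam is a guess (relative change clipped to [0, 1]), and a toy quadratic cost stands in for a DNN acoustic model. It only shows one plausible way to blend the two gradient evaluations.

```python
def interpolation_weight(prev_cost, curr_cost, eps=1e-12):
    """Assumed mapping: relative cost change between two epochs, clipped to [0, 1]."""
    rate = abs(prev_cost - curr_cost) / (abs(prev_cost) + eps)
    return min(1.0, max(0.0, rate))


def hybrid_step(theta, velocity, grad_fn, lr, mu, lam):
    """One update that blends the momentum-SGD and NAG gradient evaluations.

    lam = 0 reduces to SGD with momentum (gradient at theta);
    lam = 1 reduces to NAG (gradient at the look-ahead point theta + mu * v).
    """
    lookahead = [t + mu * v for t, v in zip(theta, velocity)]
    g_sgd = grad_fn(theta)
    g_nag = grad_fn(lookahead)
    g = [(1.0 - lam) * a + lam * b for a, b in zip(g_sgd, g_nag)]
    velocity = [mu * v - lr * gi for v, gi in zip(velocity, g)]
    theta = [t + v for t, v in zip(theta, velocity)]
    return theta, velocity


def cost(theta):
    # Toy quadratic cost standing in for the DNN training objective.
    return 0.5 * sum(t * t for t in theta)


def grad(theta):
    # Gradient of the toy quadratic cost.
    return list(theta)


def train(theta, epochs=50, steps_per_epoch=10, lr=0.1, mu=0.9):
    velocity = [0.0] * len(theta)
    lam = 0.5  # arbitrary starting balance (assumption)
    prev_cost = cost(theta)
    for _ in range(epochs):
        for _ in range(steps_per_epoch):  # each step plays the role of a mini-batch
            theta, velocity = hybrid_step(theta, velocity, grad, lr, mu, lam)
        curr_cost = cost(theta)
        # Re-balance the two algorithms from the epoch-to-epoch cost change.
        lam = interpolation_weight(prev_cost, curr_cost)
        prev_cost = curr_cost
    return theta


if __name__ == "__main__":
    final = train([1.0, -2.0])
    print(final, cost(final))
```

Blending the two gradients inside a single velocity update keeps one shared momentum buffer, so moving lam between 0 and 1 shifts smoothly from one optimizer to the other. Whether the paper interpolates gradients, velocities or parameter updates is not stated in this record.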