Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chiu, Tzu-Hsuan | en_US |
dc.contributor.author | Chiang, Chen-Yu | en_US |
dc.contributor.author | Liao, Yuan-Fu | en_US |
dc.contributor.author | Yang, Jyh-Her | en_US |
dc.contributor.author | Wang, Yih-Ru | en_US |
dc.contributor.author | Chen, Sin-Horng | en_US |
dc.date.accessioned | 2014-12-08T15:33:16Z | - |
dc.date.available | 2014-12-08T15:33:16Z | - |
dc.date.issued | 2012 | en_US |
dc.identifier.isbn | 978-7-5608-4869-3 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/23150 | - |
dc.description.abstract | This paper reports a study on introducing prosodic information into acoustic modeling (AM) for speech recognition. It extends the conventional context-dependent (CD) triphone HMM modeling approach to also consider the dependency of the phone model on the break type of the nearby inter-syllable boundary. Four break types are considered: major break, minor break, normal non-break, and tightly-coupled non-break. In the training phase, break labeling is performed automatically by a previously proposed Prosody Labeling and Modeling algorithm. Then, prosody- and phonetic-dependent phone models are constructed by standard decision tree-based context clustering of HMMs. The effectiveness of the new AM was examined on a Mandarin syllable recognition task. Experimental results showed that the new approach outperformed the conventional CD-AM, achieving a better syllable recognition rate and producing a more efficient syllable lattice with a better trade-off between complexity and syllable coverage rate. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | acoustic modeling | en_US |
dc.subject | speech recognition | en_US |
dc.subject | prosody-dependent acoustic model | en_US |
dc.subject | prosodic break | en_US |
dc.title | Prosody-dependent Acoustic Modeling for Mandarin Speech Recognition | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.journal | PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON SPEECH PROSODY, VOLS I AND II | en_US |
dc.citation.spage | 139 | en_US |
dc.citation.epage | 142 | en_US |
dc.contributor.department | 電機工程學系 | zh_TW |
dc.contributor.department | Department of Electrical and Computer Engineering | en_US |
dc.identifier.wosnumber | WOS:000325160200035 | - |
Appears in Collections: | Conferences Paper |