Full metadata record
DC Field | Value | Language
dc.contributor.author | Lin, Cheng-Hsien | en_US
dc.contributor.author | You, Chung-Long | en_US
dc.contributor.author | Chiang, Chen-Yu | en_US
dc.contributor.author | Wang, Yih-Ru | en_US
dc.contributor.author | Chen, Sin-Horng | en_US
dc.date.accessioned | 2019-06-03T01:08:34Z | -
dc.date.available | 2019-06-03T01:08:34Z | -
dc.date.issued | 2019-04-01 | en_US
dc.identifier.issn | 0001-4966 | en_US
dc.identifier.uri | http://dx.doi.org/10.1121/1.5099263 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/151945 | -
dc.description.abstract | In this paper, a hierarchical prosody model (HPM)-based method for Mandarin spontaneous speech is proposed. First, an HPM is designed for describing relations among acoustic features of utterances, linguistic features of texts, and prosodic tags representing the underlying hierarchical prosodic structures of utterances. Subsequently, a sequential optimization algorithm is employed to train the HPM based on a large conversational speech corpus, the Mandarin Conversational Dialogue Corpus (MCDC), which features orthographic transcriptions and prosodic event annotations. In this unsupervised training method, all utterances of the MCDC are labeled with two types of prosodic tags, namely, break and prosodic states, automatically and simultaneously. After training, the HPM parameters are examined to identify critical prosodic properties of Mandarin spontaneous speech, which are then compared with their counterparts in the read-speech HPM. The prosodic tags on the studied utterances enable mapping of various prosodic events onto the hierarchical prosodic structures of the utterances. Prosodic analyses of some disfluent events are conducted using the prosodic tags affixed to the MCDC. Finally, an application of the HPM to assist in Mandarin spontaneous-speech recognition is discussed. Significant relative error rate reductions of 9.0%, 9.2%, 15.6%, and 7.3% are obtained for base-syllable, character, tone, and word recognition, respectively. (C) 2019 Acoustical Society of America. | en_US
dc.language.iso | en_US | en_US
dc.title | Hierarchical prosody modeling for Mandarin spontaneous speech | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1121/1.5099263 | en_US
dc.identifier.journal | JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | en_US
dc.citation.volume | 145 | en_US
dc.citation.issue | 4 | en_US
dc.citation.spage | 2576 | en_US
dc.citation.epage | 2596 | en_US
dc.contributor.department | 電機工程學系 | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000466779100066 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Articles