Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hwang, Hsin-Te | en_US |
dc.contributor.author | Tsao, Yu | en_US |
dc.contributor.author | Wang, Hsin-Min | en_US |
dc.contributor.author | Wang, Yih-Ru | en_US |
dc.contributor.author | Chen, Sin-Horng | en_US |
dc.date.accessioned | 2017-04-21T06:49:29Z | - |
dc.date.available | 2017-04-21T06:49:29Z | - |
dc.date.issued | 2015 | en_US |
dc.identifier.isbn | 978-988-14768-0-7 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/136226 | - |
dc.description.abstract | Voice conversion (VC) using artificial neural networks (ANNs) has been shown to produce converted speech of better sound quality than VC using Gaussian mixture models (GMMs). Although ANN-based VC works reasonably well, there is still room for further improvement. One promising approach is to adopt successful techniques from statistical model-based parameter generation (SMPG), such as trajectory-based mapping approaches that were originally designed for GMM-based VC and hidden Markov model (HMM)-based speech synthesis. This study presents a probabilistic interpretation of ANN-based VC, under which ANN-based VC can easily incorporate these SMPG techniques. Experimental results demonstrate that two trajectory-based mapping techniques, the maximum likelihood parameter generation (MLPG) algorithm and maximum likelihood-based trajectory mapping considering global variance (referred to as MLGV), effectively improve the performance of ANN-based VC over both conventional ANN-based VC with frame-based mapping and GMM-based VC with the MLPG algorithm. Moreover, ANN-based VC with the trajectory-based mapping techniques achieves performance comparable to that of state-of-the-art GMM-based VC with the MLGV algorithm. | en_US |
dc.language.iso | en_US | en_US |
dc.title | A Probabilistic Interpretation for Artificial Neural Network-based Voice Conversion | en_US |
dc.type | Proceedings Paper | en_US |
dc.identifier.journal | 2015 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA) | en_US |
dc.citation.spage | 552 | en_US |
dc.citation.epage | 558 | en_US |
dc.contributor.department | 電機學院 | zh_TW |
dc.contributor.department | College of Electrical and Computer Engineering | en_US |
dc.identifier.wosnumber | WOS:000382954100106 | en_US |
dc.citation.woscount | 0 | en_US |
Appears in Collections: | Conferences Paper |