Full metadata record
DC Field | Value | Language
dc.contributor.author | Hwang, Hsin-Te | en_US
dc.contributor.author | Wu, Yi-Chiao | en_US
dc.contributor.author | Peng, Yu-Huai | en_US
dc.contributor.author | Hsu, Chin-Cheng | en_US
dc.contributor.author | Tsao, Yu | en_US
dc.contributor.author | Wang, Hsin-Min | en_US
dc.contributor.author | Wang, Yih-Ru | en_US
dc.contributor.author | Chen, Sin-Horng | en_US
dc.date.accessioned | 2019-04-02T05:59:05Z | -
dc.date.available | 2019-04-02T05:59:05Z | -
dc.date.issued | 2018-11-01 | en_US
dc.identifier.issn | 1016-2364 | en_US
dc.identifier.uri | http://dx.doi.org/10.6688/JISE.201811_34(6).0008 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/148520 | -
dc.description.abstract | This paper presents a novel locally linear embedding (LLE)-based framework for exemplar-based spectral conversion (SC). The key feature of the proposed SC framework is that it integrates the LLE algorithm, a manifold learning method, with the conventional exemplar-based SC method. One important advantage of the LLE-based SC framework is that it can be applied to either one-to-one SC or many-to-one SC. For one-to-one SC, a parallel speech corpus consisting of speech from the pre-specified source and target speakers is used to construct the paired source and target dictionaries in advance. During online conversion, the LLE-based SC method converts the source spectral features to target-like spectral features based on the paired dictionaries. When applied to many-to-one SC, on the other hand, the system can convert the voice of any unseen source speaker to that of a desired target speaker without requiring parallel training utterances to be collected from them beforehand. To further improve the quality of the converted speech, the maximum likelihood parameter generation (MLPG) and global variance (GV) methods are adopted in the proposed SC systems. Experimental results demonstrate that the proposed one-to-one SC system is comparable with the state-of-the-art Gaussian mixture model (GMM)-based one-to-one SC system in terms of speech quality and speaker similarity, and that the many-to-one SC system can approximate the performance of the one-to-one SC system. | en_US
dc.language.iso | en_US | en_US
dc.subject | voice conversion | en_US
dc.subject | locally linear embedding | en_US
dc.subject | exemplar-based | en_US
dc.subject | many-to-one | en_US
dc.subject | manifold learning | en_US
dc.title | Voice Conversion Based on Locally Linear Embedding | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.6688/JISE.201811_34(6).0008 | en_US
dc.identifier.journal | JOURNAL OF INFORMATION SCIENCE AND ENGINEERING | en_US
dc.citation.volume | 34 | en_US
dc.citation.spage | 1493 | en_US
dc.citation.epage | 1516 | en_US
dc.contributor.department | 電機工程學系 | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000451364100008 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Articles
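
Illustrative sketch of the conversion step described in the abstract. The abstract says that source spectral features are converted to target-like features using paired source and target dictionaries together with the LLE algorithm. One way this could look, under stated assumptions, is shown below: each source frame is reconstructed from its nearest source exemplars with weights that sum to one, and the same weights are applied to the paired target exemplars. The function names (lle_weights, lle_convert), the neighbor count k, and the regularization constant reg are hypothetical choices for illustration and are not taken from the paper. The MLPG and GV post-processing mentioned in the abstract is omitted.

```python
import numpy as np


def lle_weights(x, neighbors, reg=1e-3):
    """Weights that reconstruct x from its neighbors, constrained to sum to 1.

    x: (D,) feature vector; neighbors: (K, D) exemplars. `reg` is an assumed
    regularization constant, not a value from the paper.
    """
    diffs = neighbors - x                 # (K, D) offsets from the query frame
    gram = diffs @ diffs.T                # (K, K) local Gram matrix
    trace = np.trace(gram)
    # Regularize so the system is solvable even when the Gram matrix is singular.
    gram = gram + reg * (trace if trace > 0 else 1.0) * np.eye(len(neighbors))
    w = np.linalg.solve(gram, np.ones(len(neighbors)))
    return w / w.sum()                    # enforce the sum-to-one constraint


def lle_convert(source_frames, src_dict, tgt_dict, k=4):
    """Convert each source frame using paired dictionaries (row i of src_dict
    is assumed to be aligned with row i of tgt_dict)."""
    converted = np.empty((len(source_frames), tgt_dict.shape[1]))
    for i, x in enumerate(source_frames):
        dists = np.sum((src_dict - x) ** 2, axis=1)   # squared Euclidean distances
        idx = np.argsort(dists)[:k]                   # k nearest source exemplars
        w = lle_weights(x, src_dict[idx])
        converted[i] = w @ tgt_dict[idx]              # same weights, target exemplars
    return converted


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src_dict = rng.normal(size=(200, 24))   # synthetic stand-in for source exemplars
    tgt_dict = rng.normal(size=(200, 24))   # synthetic stand-in for paired target exemplars
    frames = rng.normal(size=(10, 24))      # synthetic source frames to convert
    print(lle_convert(frames, src_dict, tgt_dict).shape)
```

The data in the usage section is random and only exercises the shapes. In a one-to-one setting as described in the abstract, the dictionaries would instead hold aligned frames from a parallel corpus.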