Full metadata record
DC Field | Value | Language
dc.contributor.author | Tu, Youzhi | en_US
dc.contributor.author | Mak, Man-Wai | en_US
dc.contributor.author | Chien, Jen-Tzung | en_US
dc.date.accessioned | 2020-10-05T01:59:45Z | -
dc.date.available | 2020-10-05T01:59:45Z | -
dc.date.issued | 2020-01-01 | en_US
dc.identifier.issn | 2329-9290 | en_US
dc.identifier.uri | http://dx.doi.org/10.1109/TASLP.2020.3004760 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/154890 | -
dc.description.abstract | Domain mismatch is a common problem in speaker verification (SV) and often causes performance degradation. For the system relying on the Gaussian PLDA backend to suppress the channel variability, the performance would be further limited if there is no Gaussianity constraint on the learned embeddings. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) that incorporates an InfoVAE into domain adversarial training (DAT) to reduce domain mismatch and simultaneously meet the Gaussianity requirement of the PLDA backend. Specifically, DAT is applied to produce speaker discriminative and domain-invariant features, while the InfoVAE performs variational regularization on the embedded features so that they follow a Gaussian distribution. Another benefit of the InfoVAE is that it avoids posterior collapse in VAEs by preserving the mutual information between the embedded features and the training set so that extra speaker information can be retained in the features. Experiments on both SRE16 and SRE18-CMN2 show that the InfoVDANN outperforms the recent VDANN, which suggests that increasing the mutual information between the embedded features and input features enables the InfoVDANN to extract extra speaker information that is otherwise not possible. | en_US
dc.language.iso | en_US | en_US
dc.subject | Speaker verification (SV) | en_US
dc.subject | domain adaptation | en_US
dc.subject | domain adversarial training | en_US
dc.subject | variational autoencoder | en_US
dc.subject | mutual information | en_US
dc.title | Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification | en_US
dc.type | Article | en_US
dc.identifier.doi | 10.1109/TASLP.2020.3004760 | en_US
dc.identifier.journal | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | en_US
dc.citation.volume | 28 | en_US
dc.citation.spage | 2013 | en_US
dc.citation.epage | 2024 | en_US
dc.contributor.department | 電機工程學系 | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000548750200002 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Journal Articles