Full metadata record
DC Field | Value | Language
dc.contributor.author | Chien, Jen-Tzung | en_US
dc.contributor.author | Kuo, Chun-Lin | en_US
dc.date.accessioned | 2020-07-01T05:21:48Z | -
dc.date.available | 2020-07-01T05:21:48Z | -
dc.date.issued | 2019-01-01 | en_US
dc.identifier.isbn | 978-1-7281-0306-8 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/154481 | -
dc.description.abstract | This paper presents a new generative adversarial network (GAN) that artificially generates i-vectors to compensate for imbalanced or insufficient data in speaker recognition based on probabilistic linear discriminant analysis. In principle, a GAN can generate artificial data that is misclassified as real data; in practice, however, GANs suffer from mode collapse in the two-player optimization over generator and discriminator. This study addresses the challenge by improving model regularization through characterizing the weight uncertainty in the GAN. A new Bayesian GAN is implemented to learn a regularized model from diverse data, where strong modes are flattened via marginalization. In particular, we present a variational GAN (VGAN) in which the encoder, generator, and discriminator are jointly estimated by variational inference, which significantly reduces the computational cost. To preserve gradient values, a learning objective based on the Wasserstein distance is further introduced, alleviating both mode collapse and gradient vanishing. Experiments on the NIST i-vector Speaker Recognition Challenge demonstrate the superiority of the proposed VGAN over the variational autoencoder, the standard GAN, and a sampling-based Bayesian GAN. Both learning efficiency and generation performance are evaluated. | en_US
dc.language.iso | en_US | en_US
dc.subject | generative adversarial networks | en_US
dc.subject | Bayesian learning | en_US
dc.subject | variational autoencoder | en_US
dc.subject | speaker recognition | en_US
dc.title | BAYESIAN ADVERSARIAL LEARNING FOR SPEAKER RECOGNITION | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.journal | 2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019) | en_US
dc.citation.spage | 381 | en_US
dc.citation.epage | 388 | en_US
dc.contributor.department | 電機工程學系 (Department of Electrical Engineering) | zh_TW
dc.contributor.department | Department of Electrical and Computer Engineering | en_US
dc.identifier.wosnumber | WOS:000539883100051 | en_US
dc.citation.woscount | 0 | en_US
Appears in Collections: Conferences Paper
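The abstract above combines a variational encoder (estimated by variational inference) with a Wasserstein learning objective. The paper's own model is not reproduced here; the following is only a minimal NumPy sketch of the two generic ingredients it names — the reparameterization trick used by variational encoders and the Wasserstein critic objective that preserves gradients. All names, weight shapes, and dimensions are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    """Variational encoder sketch: map a batch of i-vectors x to latent
    samples z via the reparameterization trick, z = mu + sigma * eps,
    so the sampling step stays differentiable with respect to the weights."""
    mu = x @ W_mu
    logvar = x @ W_logvar
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps
    return z, mu, logvar

def critic_wasserstein_loss(scores_real, scores_fake):
    """Wasserstein critic objective: maximize E[D(real)] - E[D(fake)].
    Negated here so it reads as a loss to minimize. Unlike the saturating
    GAN loss, its gradient does not vanish when the two distributions
    barely overlap."""
    return -(np.mean(scores_real) - np.mean(scores_fake))

# Toy batch: 4 "i-vectors" of dimension 8, latent dimension 3 (illustrative).
x = rng.standard_normal((4, 8))
W_mu = rng.standard_normal((8, 3)) * 0.1
W_logvar = rng.standard_normal((8, 3)) * 0.1

z, mu, logvar = encode(x, W_mu, W_logvar)
print(z.shape)  # (4, 3)
```

In a full VGAN training loop, z would feed a generator whose outputs are scored by the critic, and encoder, generator, and critic would be updated jointly; that loop is omitted here.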