BAYESIAN ADVERSARIAL LEARNING FOR SPEAKER RECOGNITION

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chien, Jen-Tzung	en_US
dc.contributor.author	Kuo, Chun-Lin	en_US
dc.date.accessioned	2020-07-01T05:21:48Z	-
dc.date.available	2020-07-01T05:21:48Z	-
dc.date.issued	2019-01-01	en_US
dc.identifier.isbn	978-1-7281-0306-8	en_US
dc.identifier.uri	http://hdl.handle.net/11536/154481	-
dc.description.abstract	This paper presents a new generative adversarial network (GAN) that artificially generates i-vectors to compensate for imbalanced or insufficient data in speaker recognition based on probabilistic linear discriminant analysis. In theory, a GAN is powerful enough to generate artificial data that are misclassified as real data. However, GANs suffer from the mode collapse problem in the two-player optimization over generator and discriminator. This study addresses this challenge by improving model regularization through characterizing the weight uncertainty in the GAN. A new Bayesian GAN is implemented to learn a regularized model from diverse data, where the strong modes are flattened via marginalization. In particular, we present a variational GAN (VGAN) in which the encoder, generator and discriminator are jointly estimated according to variational inference. The computation cost is significantly reduced. To preserve gradient values, a learning objective based on the Wasserstein distance is further introduced. The issues of mode collapse and vanishing gradients are alleviated. Experiments on the NIST i-vector Speaker Recognition Challenge demonstrate the superiority of the proposed VGAN over the variational autoencoder, the standard GAN and the Bayesian GAN based on the sampling method. The learning efficiency and generation performance are evaluated.	en_US
dc.language.iso	en_US	en_US
dc.subject	generative adversarial networks	en_US
dc.subject	Bayesian learning	en_US
dc.subject	variational autoencoder	en_US
dc.subject	speaker recognition	en_US
dc.title	BAYESIAN ADVERSARIAL LEARNING FOR SPEAKER RECOGNITION	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.journal	2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019)	en_US
dc.citation.spage	381	en_US
dc.citation.epage	388	en_US
dc.contributor.department	電機工程學系	zh_TW
dc.contributor.department	Department of Electrical and Computer Engineering	en_US
dc.identifier.wosnumber	WOS:000539883100051	en_US
dc.citation.woscount	0	en_US
Appears in Collections:	Conference Papers