Title: Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification
Authors: Tu, Youzhi
Mak, Man-Wai
Chien, Jen-Tzung
Department of Electrical and Computer Engineering
Keywords: Speaker verification (SV); domain adaptation; domain adversarial training; variational autoencoder; mutual information
Issue Date: 1-Jan-2020
Abstract: Domain mismatch is a common problem in speaker verification (SV) and often causes performance degradation. For systems that rely on a Gaussian PLDA backend to suppress channel variability, performance is further limited if no Gaussianity constraint is imposed on the learned embeddings. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) that incorporates an InfoVAE into domain adversarial training (DAT) to reduce domain mismatch while meeting the Gaussianity requirement of the PLDA backend. Specifically, DAT is applied to produce speaker-discriminative and domain-invariant features, while the InfoVAE performs variational regularization on the embedded features so that they follow a Gaussian distribution. A further benefit of the InfoVAE is that it avoids the posterior collapse seen in VAEs by preserving the mutual information between the embedded features and the training set, so that extra speaker information is retained in the features. Experiments on both SRE16 and SRE18-CMN2 show that the InfoVDANN outperforms the recent VDANN, which suggests that increasing the mutual information between the embedded features and the input features enables the InfoVDANN to extract speaker information that would otherwise be lost.
URI: http://dx.doi.org/10.1109/TASLP.2020.3004760
http://hdl.handle.net/11536/154890
ISSN: 2329-9290
DOI: 10.1109/TASLP.2020.3004760
Journal: IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
Volume: 28
Start page: 2013
End page: 2024
Appears in Collections: Journal Articles