Title: Reliability Analysis Focusing on Sparse Input Data Caused Distribution Mismatch Problems for Speaker Verification
Authors: Wen-Hui Lo (罗文辉)
Sin-Horng Chen (陈信宏)
College of Electrical Engineering, Telecommunications Program
Keywords: distribution mismatch; GMM; sparse data; speaker verification; reliability
Issue date: 2005
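Abstract: In speech recognition, models often have to be calibrated with only a small amount of data in order to make them more robust. Speaker verification frequently faces the same situation: speaker models must be trained or tested when very little data is available.
This study first shows that, when the input data are sparse, the distribution of the scores measured on a Gaussian mixture model (GMM) in speaker verification deviates from the originally assumed distribution. We call this phenomenon the “distribution mismatch” problem. To address it, we first propose approximating the score distribution with a truncated probability density function. Building on this, we use order statistics to derive a graph-based joint probability distribution model that can describe both the complete probability density function and the truncated one in probabilistic form.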
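We establish the joint probability density function of five random variables: the input data, the minimum of the data, the range of the data, the cumulative probability over that range (the coverage), and the data length. Combined with the sampling scheme of Gaussian quadrature, this yields the most accurate estimation formula for the smallest number of sample points. The ultimate goal is to use this richer information to compensate for the increase in the standard error of conventional statistical estimates that is caused by sparse data.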
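As a rough illustration of the quadrature idea, the following sketch estimates the coverage of a range [a, b], that is, the integral of a PDF over it, with a three-point Gauss-Legendre rule. The standard normal PDF, the interval, and the function names are assumptions chosen for illustration; the abstract does not state the integrand or the number of nodes used in the thesis.

```python
import math

# Three-point Gauss-Legendre nodes and weights on [-1, 1].
_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def normal_pdf(x):
    """Standard normal density (an assumed stand-in for the score PDF)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def gauss_legendre_3(f, a, b):
    """Approximate the integral of f over [a, b] with three evaluations."""
    half = 0.5 * (b - a)
    mid = 0.5 * (a + b)
    return half * sum(w * f(mid + half * t) for t, w in zip(_NODES, _WEIGHTS))


def exact_normal_coverage(a, b):
    """Closed-form coverage of [a, b] under the standard normal, via erf."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return cdf(b) - cdf(a)


approx = gauss_legendre_3(normal_pdf, -1.0, 1.0)
exact = exact_normal_coverage(-1.0, 1.0)
print(f"quadrature: {approx:.6f}  exact: {exact:.6f}")
```

A three-point rule integrates polynomials up to degree five exactly, which is why so few evaluations can already track a smooth density closely over a short interval.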
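Finally, we apply a hypothesis test to the equal error rate (EER), using the average scores of speakers' utterances normalized against the universal background model (UBM). The experimental results show that the hypothesis test effectively reduces the verification errors caused by sampling error.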
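To make the quantity being tested concrete, here is a minimal sketch of how an equal error rate can be read off two sets of scores. The function name, the threshold sweep, and the toy scores are assumptions for illustration; the thesis's UBM normalization and its hypothesis test are not reproduced here.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return an EER estimate from a sweep over the observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false-acceptance rate (FAR) and the
    false-rejection rate (FRR) are closest, as the mean of the two.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, 0.5 * (far + frr)
    return best_eer


# Toy normalized average scores (made-up numbers, not thesis data).
targets = [1.0, 2.0, 3.0, 4.0]
impostors = [0.0, 1.0, 2.0, 3.0]
eer = equal_error_rate(targets, impostors)
print(f"EER = {eer:.3f}")
```

With only a handful of scores per speaker, the FAR and FRR curves move in coarse steps, so the estimated EER is itself subject to sampling error.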
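The other main result of this study is to establish that, with sparse input, whether the originally assumed distribution actually appears is a random event governed by probability, not a deterministic property that can simply be assumed. Our conclusion is that, when the number of input samples is less than 20, the assumption that “the coverage of the input samples matches the originally assumed PDF” must be described as a probabilistic event to be fully captured, and this study derives the formula describing that event.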
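The probabilistic nature of coverage can be illustrated with a standard order-statistics result: for n independent draws from any continuous distribution, the coverage F(max) - F(min) follows a Beta(n-1, 2) distribution, so P(coverage <= c) = n c^(n-1) - (n-1) c^n and the mean coverage is (n-1)/(n+1). The sketch below checks this by simulation. It is a textbook result used for illustration, not the five-variable joint PDF derived in the thesis, and the function names and the 0.95 level are assumptions.

```python
import random


def coverage_cdf(c, n):
    """P(coverage <= c) for n i.i.d. draws from a continuous distribution."""
    return n * c ** (n - 1) - (n - 1) * c ** n


def mean_coverage(n):
    """Expected coverage of the range spanned by n draws."""
    return (n - 1) / (n + 1)


def simulate_coverage(n, trials, rng):
    """Monte Carlo coverage samples.

    Applying the true CDF to the data gives uniform variates, so uniform
    draws stand in for F(x) without loss of generality.
    """
    samples = []
    for _ in range(trials):
        u = [rng.random() for _ in range(n)]
        samples.append(max(u) - min(u))
    return samples


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (5, 10, 20, 50):
    sims = simulate_coverage(n, 20000, rng)
    sim_mean = sum(sims) / len(sims)
    p_below = coverage_cdf(0.95, n)
    print(f"n={n:3d}  mean coverage: theory {mean_coverage(n):.3f}, "
          f"simulated {sim_mean:.3f}  P(coverage <= 0.95) = {p_below:.3f}")
```

For n = 20 the formula gives a probability of about 0.74 that the samples span less than 95% of the probability mass, which is one way to see why full coverage of the assumed PDF cannot be taken for granted at small sample sizes.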
ABSTRACT
Sparse input data are a frequent obstacle to robust model testing in speech recognition. The same problem arises in speaker verification when only a small amount of enrollment data is available for training or testing.
We address a problem caused by sparse input data that we call “distribution mismatch” (DM). The core of DM is that the input data used to compute Gaussian mixture model (GMM) scores do not cover the full range of the originally assumed probability distribution function (PDF). The PDF produced by sparse input may therefore differ from the assumed one, and we suggest modeling this situation with a truncated probability distribution function.
Most importantly, we derive a new joint PDF based on graph theory and order statistics; the new formula can represent either the truncated PDF or the original PDF.
We establish this joint PDF over five random variables: the input data, the minimum of the input data, the range of the input data, the coverage of the input data, and the sample size. It is estimated with Gaussian quadrature integration.
In the experiments, we apply a hypothesis test to the equal error rate (EER) computed from two scores, each normalized to the universal background model (UBM): the average per-frame score of each sentence spoken by the true speaker, and the same score for sentences spoken by an impostor.
The results give good evidence that the hypothesis test can decrease the error probability in speaker verification. The other finding of this study is a special fact caused by sparse input data.
We usually assume that an input random variable follows a certain probability distribution function, but when the sample size is less than 20, whether the input agrees with this assumption is itself a probabilistic event. Finally, we derive the joint probability distribution function that describes this event.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT008867547
http://hdl.handle.net/11536/76546
Appears in Collections: Thesis


Files in This Item:

  1. 754701.pdf
