Title: | A Two-stage Algorithm for De-reverberation and De-noise |
Authors: | Huang, Chun (黄群); Chi, Tai-Shih (冀泰石); Department of Electrical Engineering |
Keywords: | De-reverberation; De-noise; Room impulse response; Modulation spectrum; Deep neural network; Complex ideal mask |
Issue Date: | 2017 |
Abstract: | De-reverberation and de-noising have long been important topics in speech signal processing. In this thesis, we first model the reverberant signal through a series of approximations and simplifications, then use deep learning to learn de-reverberation via mapping and masking methods. By adding a reference modulation magnitude, derived from a different sentence, to the network input during training and applying a shuffling method, the network also de-reverberates well in unseen reverberant environments. Next, to remove reverberation and environmental noise together, as real-world conditions require, we propose a two-stage process that de-reverberates in the modulation domain and then de-noises in the magnitude spectrogram domain. The artificial additive noise introduced by the first (de-reverberation) stage is removed in the second stage along with the environmental additive noise, with the aim of improving speech reconstruction through multi-stage learning. Because the target application is human hearing, we treat both speech intelligibility and speech quality as key evaluation criteria. Accordingly, we analyze the advantages and disadvantages of each network structure by comparing intelligibility and quality scores against other algorithms on two speech corpora. |
URI: | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070450730 http://hdl.handle.net/11536/141942 |
Appears in Collections: | Thesis |