Title: 具聽音辨位和語音增強及辨識的唱歌機器人
Source Localization, Speech Enhancement and Recognition of a Singing Robot
Authors: 桂振益
Kuei, Chen-Yi
Bai, Ming-Sian
Keywords: DOA Estimation; Speech Enhancement; Speech Recognition; Robot
Issue Date: 2010
Abstract: Robotics is developing rapidly, and robots now serve many purposes, including security, military, home-care, and entertainment applications. As living standards and expectations for quality of life rise, entertainment robots have become increasingly important. This thesis presents a jukebox (song-request) robot that tracks the user and turns to face the user's direction, which requires both sound source localization and speech recognition. The localization methods considered are an object-related transfer function (ORTF) based method, cross correlation (CC), and generalized cross correlation (GCC). Speech recognition extracts feature parameters and then matches them using dynamic time warping (DTW). To clean up the user's spoken command and raise the recognition rate, we apply speech enhancement, comprising array signal processing and a phase difference (PD) based method. For each task, the best-performing algorithm is integrated on a LEGO NXT robot, operated through a Windows-based NI DAQ data acquisition system.
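The abstract names cross correlation for localization and dynamic time warping for recognition but gives no implementation details. The following is only a minimal sketch of those two textbook techniques, not the thesis's code. The function names (`cc_delay`, `doa_degrees`, `dtw_distance`), the two-microphone far-field geometry, the 343 m/s speed of sound, and the use of scalar features in place of real speech feature vectors are all assumptions made for illustration.

```python
import math


def cc_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes the cross correlation of x and y.

    A positive result means y is a delayed copy of x.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def doa_degrees(lag, fs, mic_distance, c=343.0):
    """Convert a sample lag to an arrival angle for one microphone pair.

    Assumes a far-field source; c is an assumed speed of sound in m/s.
    """
    s = lag / fs * c / mic_distance
    s = max(-1.0, min(1.0, s))  # clamp so asin stays defined
    return math.degrees(math.asin(s))


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of scalar features."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for ai in a:
        cur = [inf] * (len(b) + 1)
        for j, bj in enumerate(b, start=1):
            # local cost plus the cheapest of the three allowed predecessors
            cur[j] = abs(ai - bj) + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    left = [0, 0, 1, 2, 1, 0, 0, 0]
    right = [0, 0, 0, 0, 1, 2, 1, 0]  # same pulse, two samples later
    lag = cc_delay(left, right, max_lag=3)
    print("lag:", lag, "angle:", doa_degrees(lag, fs=16000, mic_distance=0.1))
    print("DTW cost:", dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

In a recognizer built this way, a spoken command would be compared against each stored template with `dtw_distance` and assigned to the template with the lowest cost. The generalized cross correlation variant mentioned in the abstract would add a frequency-domain weighting before the peak search, which this sketch omits.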


