Title: | 國語電話語音查號系統之製作 An Implementation of Telephone Number Inquiry System |
Authors: | 蔡佳穎 Tsai, Chia-yin; 陳信宏 Chen, Sin-Horng; Institute of Communications Engineering |
Keywords: | Speech Recognition (語音辨認) |
Issue Date: | 1996 |
Abstract: | In this thesis, an on-line Mandarin telephone number inquiry system is implemented on a PC with a Dialogic D41/D telephone interface card and a 16-bit Sound Blaster card. It is an isolated-word speech recognition system running under Windows 95. It uses hidden Markov models (HMMs), with a silence model, a breath model, 100 final-dependent initial models, and 39 final models as the basic recognition units. The vocabulary contains 1922 names of banks and insurance companies in the Taipei area. Two main topics are studied. The first is a new pre-processing stage based on a recurrent neural network (RNN), which detects the endpoints of the input speech and pre-classifies input frames into four broad classes: I (initial), F (final), S (silence), and T (transition). The pre-classification speeds up the subsequent recognition by restricting the search space for frames in the three stable classes I, F, and S. The second topic is the upper-level grammar structure: the construction of a pronunciation tree for recognizing the 1922 words, together with the word-to-word connection structure. Two tree-construction methods are studied. The direct method shares only the common beginning parts of words and yields a pronunciation tree of 5400 nodes. The more sophisticated method shares both the common beginning and ending parts of words and yields a tree of only 902 nodes, making it more efficient. The system was evaluated by simulation on a test database of 327 utterances. It achieved a recognition rate of 92.35%, and recognition takes 5.04 seconds per second of speech. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT850436072 http://hdl.handle.net/11536/62152 |
Appears in Collections: | Thesis |