Title: 國語電話語音查號系統之製作
An Implementation of Telephone Number Inquiry System
Authors: 蔡佳穎
Tsai, Chia-yin
陳信宏
Chen Sin-Horng
Institute of Communication Engineering
Keywords: 語音辨認;Speech Recognition
Issue Date: 1996
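Abstract: This thesis implements an on-line Mandarin speech inquiry system. It uses
hidden Markov model (HMM) recognition and can recognize the names of 1922
financial and insurance institutions in the Taipei area. The work covers two
parts. The first addresses front-end processing, namely speech pre-segmentation
and endpoint detection: a recurrent neural network pre-segments the speech, its
output is used for endpoint detection, and the speech is divided into four broad
classes (initial, final, silence, and transition) to speed up the subsequent
recognition stage. The second part addresses the higher-level grammar structure,
discussing the construction of a lexical tree for the 1922 words and the
word-to-word connection architecture. In a test on 327 utterances, the system
achieved a recognition rate of 92.35%, and recognition took 5.04 seconds per
second of speech.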
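The speed-up from the four-way frame pre-classification can be illustrated with a minimal sketch. It assumes that each frame already carries a class label (I, F, S, or T) from the pre-processing stage and that every recognition unit belongs to exactly one stable class. The model names, the `MODEL_CLASS` table, and both functions are hypothetical and are not taken from the thesis; the breath model is left out because the abstract does not say which class it is searched under.

```python
from typing import Dict, List, Set

# Hypothetical toy inventory: recognition unit -> broad class.
# The real system uses 100 initial models and 39 final models.
MODEL_CLASS: Dict[str, str] = {
    "sil": "S",
    "b_a": "I", "t_ai": "I",
    "a": "F", "ai": "F", "ang": "F",
}


def allowed_models(frame_class: str) -> Set[str]:
    """Return the units that may be active for a frame of the given class.

    Frames in a stable class (I, F, S) only admit units of that class.
    Transition frames (T) are left unrestricted.
    """
    if frame_class == "T":
        return set(MODEL_CLASS)
    return {m for m, c in MODEL_CLASS.items() if c == frame_class}


def active_counts(frame_classes: List[str]) -> List[int]:
    """Number of candidate units per frame after the restriction."""
    return [len(allowed_models(c)) for c in frame_classes]


if __name__ == "__main__":
    # One silence frame, one initial, one final, one transition, one silence.
    print(active_counts(list("SIFTS")))
```

In this toy setting only the transition frame keeps all six units active; the stable frames search a subset, which is the kind of reduction the pre-classification is meant to provide.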
In this thesis, an on-line telephone number inquiry system is
implemented on a PC with a Dialogic D41/D telephone interface card
and a 16-bit Sound Blaster card. It is an isolated-word speech
recognition system operating under Windows 95. It
adopts the HMM technique, using a silence model, a breath
model, 100 final-dependent initial models, and 39 final models
as the basic recognition units. The vocabulary contains 1922
names of banks and insurance companies in the Taipei area. Two
main topics are studied in depth in this work. One is the use
of a new RNN-based pre-processing stage that detects the endpoints of
the input speech and pre-classifies input frames into
four broad classes: I (initial), F (final), S (silence), and
T (transition). The purpose of the pre-classification
is to speed up the subsequent recognition process by restricting
the search space for the three stable classes I, F, and S.
The other topic is the construction of a pronunciation
tree for recognizing the 1922 words. Two tree-construction
methods are studied. One is a direct method that shares
the common beginning parts of words; the resulting
pronunciation tree consists of 5400 nodes. The other is a more
sophisticated method that shares both the common beginning and
ending parts of words; the resulting pronunciation tree
contains only 902 nodes and is therefore more efficient. Performance
of the system was examined by simulations using a test database
containing 327 utterances. A recognition rate of 92.35% was
achieved. Recognition takes 5.04 seconds per second of
speech.
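The difference between the two tree-construction methods can be sketched on a toy vocabulary. The code below is an illustration under stated assumptions, not the thesis's algorithm: the first function builds an ordinary prefix tree (shared beginnings), and the second counts the nodes left after merging identical subtrees, which is one common way to share endings as well. The syllable-level units and the four example words are hypothetical; the actual system uses initial and final models as units.

```python
from typing import Dict, List, Tuple


class Node:
    """A pronunciation-tree node; children are keyed by recognition unit."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.is_word_end = False


def build_prefix_tree(words: List[List[str]]) -> Node:
    """Direct method: words that start with the same units share nodes."""
    root = Node()
    for units in words:
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.is_word_end = True
    return root


def count_tree_nodes(root: Node) -> int:
    """Number of nodes in the prefix tree, excluding the root."""
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        total += len(node.children)
        stack.extend(node.children.values())
    return total


def count_merged_nodes(root: Node) -> int:
    """Number of nodes left when identical labelled subtrees are merged.

    Two nodes are merged when they carry the same unit, the same word-end
    flag, and the same set of (already merged) children, so common word
    endings are stored once. The root is excluded from the count.
    """
    registry: Dict[Tuple, int] = {}

    def signature(label: str, node: Node) -> int:
        child_ids = sorted(signature(u, c) for u, c in node.children.items())
        key = (label, node.is_word_end, tuple(child_ids))
        return registry.setdefault(key, len(registry))

    signature("", root)
    return len(registry) - 1


if __name__ == "__main__":
    # Hypothetical toy vocabulary written as syllable-level units.
    words = [
        ["tai", "wan", "yin", "hang"],
        ["tai", "bei", "yin", "hang"],
        ["hua", "nan", "yin", "hang"],
        ["hua", "nan", "bao", "xian"],
    ]
    tree = build_prefix_tree(words)
    print(count_tree_nodes(tree), count_merged_nodes(tree))  # prints "13 9"
```

On this toy vocabulary the shared ending "yin hang" is stored once after merging, so the node count falls from 13 to 9. The abstract reports the same kind of reduction, from 5400 to 902 nodes, for the full 1922-word vocabulary.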
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT850436072
http://hdl.handle.net/11536/62152
Appears in Collections: Thesis