An Implementation of Telephone Number Inquiry System (國語電話語音查號系統之製作)

Full metadata record

DC Field	Value	Language
dc.contributor.author	蔡佳穎	en_US
dc.contributor.author	Tsai, Chia-yin	en_US
dc.contributor.author	陳信宏	en_US
dc.contributor.author	Chen Sin-Horng	en_US
dc.date.accessioned	2014-12-12T02:17:50Z	-
dc.date.available	2014-12-12T02:17:50Z	-
dc.date.issued	1996	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#NT850436072	en_US
dc.identifier.uri	http://hdl.handle.net/11536/62152	-
dc.description.abstract	本論文製作完成一個線上國語語音查詢系統，它使用隱藏式馬可夫模型辨認技術，可辨認1922台北地區金融保險機構名稱。研究主題包含兩個部分，第一部份針對前處理進行語音預切割及端點偵測，以一個遞迴式類神經網路進行語音預切割，根據其輸出作為端點偵測，並將語音分割成聲母、韻母、靜音以及過渡四大類，以加速後級語音辨認。第二部份為上層文法結構之處理，分別就1922詞之詞典樹的建立，以及詞接詞的架構加以討論。最後經327句語料測試結果，獲得了92.35%之辨認率，辨認速度為一秒鐘語音花費5.04秒的辨認時間。 In this thesis, an on-line telephone number inquiry system is implemented on a PC with a Dialogic D41/D telephone interface card and a 16-bit Sound Blaster card. It is an isolated-word speech recognition system running under Windows 95. It uses the hidden Markov model (HMM) technique, with a silence model, a breath model, 100 final-dependent initial models, and 39 final models as the basic recognition units. The vocabulary contains 1922 names of banks and insurance companies in the Taipei area. Two main topics are studied in depth. The first is a new pre-processing stage based on a recurrent neural network (RNN) that detects the endpoints of the input speech and pre-classifies input frames into four broad classes: I (initial), F (final), S (silence), and T (transition). The pre-classification speeds up the subsequent recognition process by restricting the search space for the three stable classes I, F, and S. The second topic is the construction of a pronunciation tree for recognizing the 1922 words. Two tree-construction methods are studied. The first is a direct method that shares the common beginning parts of words; the resulting pronunciation tree consists of 5400 nodes. The second is a more sophisticated method that shares both the common beginning and ending parts of words; the resulting pronunciation tree contains only 902 nodes and is therefore more efficient. System performance was evaluated by simulation on a test database of 327 utterances. A recognition rate of 92.35% was achieved. Recognition takes 5.04 seconds per second of speech.	zh_TW
dc.language.iso	zh_TW	en_US
dc.subject	語音辨認	zh_TW
dc.subject	Speech Recognition	en_US
dc.title	國語電話語音查號系統之製作	zh_TW
dc.title	An Implementation of Telephone Number Inquiry System	en_US
dc.type	Thesis	en_US
dc.contributor.department	電信工程研究所	zh_TW
Appears in Collections:	Thesis
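
The abstract compares a pronunciation tree that shares only word beginnings with one that also shares word endings. The sketch below illustrates why the second form needs fewer nodes. It is an illustration under stated assumptions, not the thesis's actual construction: the four-word vocabulary and its pinyin-style unit labels are hypothetical, units label the branches rather than the nodes, and ending-sharing is approximated by merging subtrees that accept identical sets of endings (a standard minimization of the prefix tree), which may differ from the method used in the thesis.

```python
class Node:
    """One node of a pronunciation tree; outgoing branches are labeled by unit."""
    def __init__(self):
        self.children = {}        # unit label -> child Node
        self.is_word_end = False  # True if a vocabulary word ends here


def build_prefix_tree(words):
    """Build a tree in which words share their common beginning parts."""
    root = Node()
    for units in words:
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.is_word_end = True
    return root


def count_tree_nodes(root):
    """Count all nodes of the prefix-sharing tree, including the root."""
    stack, total = [root], 0
    while stack:
        node = stack.pop()
        total += 1
        stack.extend(node.children.values())
    return total


def count_merged_nodes(root):
    """Count nodes left after merging subtrees that accept the same endings."""
    registry = {}  # subtree signature -> id of the merged node

    def signature(node):
        key = (
            node.is_word_end,
            tuple(sorted((unit, signature(child))
                         for unit, child in node.children.items())),
        )
        return registry.setdefault(key, len(registry))

    signature(root)
    return len(registry)


# Hypothetical toy vocabulary (assumption): each word is a sequence of unit labels.
WORDS = [
    ("tai", "bei", "yin", "hang"),
    ("tai", "wan", "yin", "hang"),
    ("hua", "nan", "yin", "hang"),
    ("hua", "nan", "bao", "xian"),
]

tree = build_prefix_tree(WORDS)
prefix_only = count_tree_nodes(tree)        # beginnings shared only
prefix_and_suffix = count_merged_nodes(tree)  # beginnings and endings shared
print(prefix_only, prefix_and_suffix)
```

On this toy vocabulary the three separate "yin hang" endings collapse into one shared path, which is the same effect the abstract reports at full scale (5400 nodes versus 902 nodes for the 1922-word vocabulary).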