Full metadata record
DC Field | Value | Language
dc.contributor.author | 馬偉雲 | en_US
dc.contributor.author | Wei-Yun Ma | en_US
dc.contributor.author | 劉啟民 | en_US
dc.contributor.author | Chi-Min Liu | en_US
dc.date.accessioned | 2014-12-12T02:20:21Z | -
dc.date.available | 2014-12-12T02:20:21Z | -
dc.date.issued | 1998 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#NT870392083 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/64108 | -
dc.description.abstract | 中文連續語音辨認技術要能夠實際應用到電腦輸入法,必須要有高辨識率以及快速的辨識時間才能達成。如何在維持高辨識率的情形之下,仍能大幅度的節省辨識時間即是本論文的研究目標。在本論文中,使用中文的詞作為搜尋單位,在維特比搜尋法中,結合詞雙連語言模型做連續語音辨識,如此可同時整合聲學處理(Acoustic Processing)與語言處理(Linguistic Processing)而得到整體最佳結果(Global Optimum)。在Pentium 450 MHz的測試環境下,使用20句連續中文語音作測試,字辨識率可達47.87%,平均一句話的辨識時間為13.4 sec。這種作法雖然在辨識率上能夠得到很好的表現,但其搜尋空間十分龐大,以聲學處理來說,搜尋空間跟詞庫大小成正比。以語言處理來說,搜尋空間跟詞庫大小平方成正比,如此龐大的搜尋空間,將會嚴重影響辨識時間。因此本論文提出兩種方法來解決此一問題。第一種針對聲學處理設法縮小搜尋空間,改善傳統的光束搜尋法(Beam Search)固定光束寬的缺點,而提出一種能隨時間而動態調整光束寬的作法。字辨識率可達48.94%,辨識時間為9.93 sec。第二種針對語言處理設法縮小搜尋空間,在以詞為單位的辨識之前,先行用極快的方法,偵測哪些時間點是可能的詞和詞交接處。在這些時間點上才作語言處理的計算,來達到縮小搜尋空間的目的。字辨識率可達47.87%,平均辨識時間為8.54 sec。最後此兩種方法結合,可得到最佳的結果。字辨識率達48.94%,平均辨識時間為7.13 sec。 | zh_TW
dc.description.abstract | A high recognition rate and a fast response time are two fundamental requirements in continuous speech recognition. In this thesis, we study how to reduce recognition time while retaining the same recognition rate. We apply the one-pass Viterbi algorithm to recognize Mandarin sentences, choosing the word as the recognition unit and integrating a word bigram language model into the Viterbi search. In a Pentium 450 MHz test environment, using 20 test sentences, the character recognition rate is 47.87% and the average recognition time is 13.4 sec per sentence. Although this method performs well in recognition rate, its search space is very large: in acoustic processing the search space is proportional to the vocabulary size, and in linguistic processing it is proportional to the square of the vocabulary size. Such a large search space seriously increases recognition time, so we present two methods to address the problem. The first is a dynamic beam search, which improves on the fixed beam width of conventional beam search by adjusting the beam width over time to reduce the search space in acoustic processing. With this method, the recognition rate is 48.94% and the average recognition time is 9.93 sec. The second method reduces the search space in linguistic processing: before the word-level search, a fast procedure detects the frames that could be boundaries between words, and the bigram model is applied only at those frames. With this method, the recognition rate is 47.87% and the average recognition time is 8.54 sec. Finally, combining the two methods gives the best result: a recognition rate of 48.94% and an average recognition time of 7.13 sec. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 連續語音辨認 | zh_TW
dc.subject | 維特比 | zh_TW
dc.subject | 光束搜尋法 | zh_TW
dc.subject | 動態光束搜尋法 | zh_TW
dc.subject | continuous speech recognition | en_US
dc.subject | viterbi | en_US
dc.subject | beam search | en_US
dc.subject | dynamic beam search | en_US
dc.title | 連續語音辨認的速度改進研究 | zh_TW
dc.title | Speed Improvement for Continuous Speech Recognition | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
Appears in collections: Theses
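
The abstract describes a beam search whose width changes over time but does not say how the width is chosen. As a rough illustration only, the sketch below runs Viterbi decoding on a toy two-state HMM and prunes hypotheses with a beam that narrows linearly from frame to frame. The linear schedule, the toy model, and all names (`beam_width_at`, `prune`, `viterbi_beam`) are assumptions made for this example and are not taken from the thesis.

```python
import math


def beam_width_at(t, total_frames, wide=8.0, narrow=2.0):
    """Hypothetical schedule: shrink the log-score beam linearly over time."""
    if total_frames <= 1:
        return wide
    return wide + (narrow - wide) * t / (total_frames - 1)


def prune(hyps, beam):
    """Keep hypotheses whose log score is within `beam` of the best one."""
    best = max(score for score, _ in hyps.values())
    return {s: h for s, h in hyps.items() if h[0] >= best - beam}


def viterbi_beam(log_init, log_trans, log_emit, obs, width_fn):
    """Viterbi decoding with per-frame beam pruning.

    Returns the best surviving state path and the number of active
    hypotheses kept after pruning at each frame.
    """
    total = len(obs)
    hyps = {s: (lp + log_emit[s][obs[0]], [s]) for s, lp in log_init.items()}
    hyps = prune(hyps, width_fn(0, total))
    active = [len(hyps)]
    for t in range(1, total):
        new_hyps = {}
        for prev, (score, path) in hyps.items():
            for nxt, lt in log_trans[prev].items():
                cand = score + lt + log_emit[nxt][obs[t]]
                # Viterbi recursion: keep only the best path into each state.
                if nxt not in new_hyps or cand > new_hyps[nxt][0]:
                    new_hyps[nxt] = (cand, path + [nxt])
        hyps = prune(new_hyps, width_fn(t, total))
        active.append(len(hyps))
    best_state = max(hyps, key=lambda s: hyps[s][0])
    return hyps[best_state][1], active


# Toy model (illustrative values only, not data from the thesis).
L9, L1 = math.log(0.9), math.log(0.1)
log_init = {"A": math.log(0.5), "B": math.log(0.5)}
log_trans = {"A": {"A": L9, "B": L1}, "B": {"A": L1, "B": L9}}
log_emit = {"A": {"x": L9, "y": L1}, "B": {"x": L1, "y": L9}}
obs = ["x", "x", "y", "y"]

dynamic_path, dynamic_active = viterbi_beam(
    log_init, log_trans, log_emit, obs, beam_width_at)
fixed_path, fixed_active = viterbi_beam(
    log_init, log_trans, log_emit, obs, lambda t, n: 8.0)
```

On this toy input both runs decode the same path, while the narrowing beam keeps one fewer hypothesis at the last frame than the fixed beam does. That mirrors the trade-off the abstract reports, where pruning more aggressively reduces work without lowering accuracy, although the real system operates on word-level acoustic models rather than a two-state HMM.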