Title: | 利用統計方法之基週期偵測器與國語連續語音聲調辨認 Statistical Pitch Detection and Tone Recognition in Mandarin Speech |
Authors: | 曹登鈞 Teng-Chun Tsao; 王逸如 Dr. Yih-Ru Wang; Institute of Communications Engineering |
Keywords: | pitch contour; candidate; voiced/unvoiced; Viterbi search; multi-layer perceptron (MLP); linear regression; deletion; insertion |
Issue Date: | 2001 |
Abstract: | This thesis proposes a statistical method for extracting a more reliable pitch contour from continuous Mandarin speech and applies it to a tone recognizer for continuous Mandarin speech. First, pitch extraction is treated as the search for the most probable pitch contour among the many candidates produced by the autocorrelation method. By suitably modeling the probability that each frame is voiced or unvoiced, together with the pitch transition probabilities between frames/segments, pitch detection is recast as a maximum likelihood (ML) problem. Experiments show that the proposed statistical pitch detector performs better than the pitch detection method in the ESPS package. Finally, a multi-layer perceptron (MLP) is used as the tone recognizer on the extracted pitch contours, achieving a tone recognition rate of 77%. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#NT900435067 http://hdl.handle.net/11536/68944 |
Appears in Collections: | Thesis |
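
The abstract describes pitch extraction as a search for the most probable contour over per-frame candidates, using voiced/unvoiced probabilities and pitch transition probabilities. The record does not give the thesis's actual probability models, so the following is only a minimal sketch of that kind of Viterbi search under assumed costs. The function names `track_pitch` and `transition_cost`, the parameters `jump_weight` and `switch_cost`, and the squared log-ratio transition penalty are all hypothetical and chosen for illustration.

```python
import math

UNVOICED = 0.0  # sentinel F0 value marking an unvoiced frame


def transition_cost(f_prev, f_cur, jump_weight=10.0, switch_cost=1.0):
    """Assumed negative log transition probability (up to a constant)."""
    if f_prev == UNVOICED and f_cur == UNVOICED:
        return 0.0
    if f_prev == UNVOICED or f_cur == UNVOICED:
        return switch_cost  # fixed penalty for a voiced/unvoiced switch
    # Penalize large relative pitch jumps (e.g. octave errors).
    return jump_weight * math.log(f_cur / f_prev) ** 2


def track_pitch(frames, p_voiced):
    """Viterbi search for the lowest-cost F0 path.

    frames[t]   : list of (f0_hz, score) candidates, score in (0, 1]
    p_voiced[t] : probability in (0, 1) that frame t is voiced
    Returns one F0 value per frame, UNVOICED for unvoiced frames.
    """
    # Each frame gets one unvoiced state plus one state per candidate,
    # stored as (f0, local negative log-likelihood).
    states = []
    for cands, pv in zip(frames, p_voiced):
        frame_states = [(UNVOICED, -math.log(1.0 - pv))]
        frame_states += [(f0, -math.log(pv) - math.log(s)) for f0, s in cands]
        states.append(frame_states)

    cost = [local for _, local in states[0]]
    back = []  # back[t-1][i]: best predecessor of state i in frame t
    for t in range(1, len(states)):
        prev = states[t - 1]
        new_cost, pointers = [], []
        for f_cur, local in states[t]:
            totals = [cost[j] + transition_cost(prev[j][0], f_cur)
                      for j in range(len(prev))]
            best_j = min(range(len(prev)), key=totals.__getitem__)
            new_cost.append(totals[best_j] + local)
            pointers.append(best_j)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final state.
    j = min(range(len(cost)), key=cost.__getitem__)
    path = [states[-1][j][0]]
    for t in range(len(back) - 1, -1, -1):
        j = back[t][j]
        path.append(states[t][j][0])
    path.reverse()
    return path


# Toy input: in the middle frame the pitch-doubled candidate scores
# higher locally, but the transition cost keeps the path continuous.
frames = [
    [(100.0, 0.9), (200.0, 0.5)],
    [(102.0, 0.6), (204.0, 0.7)],
    [(104.0, 0.9), (208.0, 0.5)],
]
contour = track_pitch(frames, [0.9, 0.9, 0.9])
```

Working in negative log-probabilities turns the product of frame and transition probabilities into a sum, so the most likely contour is the minimum-cost path. In practice the candidate scores would come from autocorrelation peaks, as the abstract indicates; here they are made-up numbers.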