Title: MLLR-based Online Hierarchical Speaker Adaptation for Speech Recognition
Authors: 周樂生
Luke Chou
陳信宏
Sin-Horng Chen
Institute of Communication Engineering
Keywords: linear regression; MLLR; online; hierarchical; speaker adaptation; speech recognition
Issue Date: 2000
Abstract: This thesis aims to build an effective online hierarchical speaker adaptation system for speech recognition. We improve both the offline and the online stages, with the goals of raising recognition performance after adaptation and reducing computation to improve efficiency. The approach is based on the MLLR technique. Offline, a class tree is built according to a splitting criterion. During online recognition, appropriate classes are selected hierarchically according to a merging criterion; a transformation matrix is then computed for each selected class, the models are transformed, and recognition proceeds with the speaker's adapted models. We build the offline class trees with different splitting criteria and likewise apply different merging criteria during online recognition. There are two splitting criteria: (1) likelihood ratio and (2) likelihood gain. There are three merging criteria: (1) likelihood ratio, (2) amount of adaptation data, and (3) variance of the class means. Experimental results show that splitting criterion 2 outperforms splitting criterion 1. Among the merging criteria, criterion 3 performs best, followed by criterion 2, with criterion 1 the worst. When the splitting criterion and the merging criterion differ, recognition performance degrades.
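The hierarchical class selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a simple merging rule based only on the amount of adaptation data: a class keeps its own transform when it has at least `min_frames` frames, and otherwise it is merged into the nearest ancestor that does. The names `Node`, `accumulate`, `assign_transform_nodes`, `adapt_mean`, and `build_demo_tree`, as well as the frame counts and the threshold, are hypothetical. The only part taken from standard MLLR is the affine mean update mu' = A mu + b. Estimating A and b is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One class in a hypothetical class tree (built offline)."""
    name: str
    frames: float = 0.0  # adaptation frames observed for this class
    children: List["Node"] = field(default_factory=list)


def accumulate(node: Node) -> float:
    """Fill each internal node with the summed frame count of its leaves."""
    if node.children:
        node.frames = sum(accumulate(child) for child in node.children)
    return node.frames


def assign_transform_nodes(
    node: Node,
    min_frames: float,
    fallback: Optional[Node] = None,
    out: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Map every leaf to the node whose transform it should share.

    Assumed rule: a node with at least `min_frames` frames gets its own
    transform; a node with less data is merged into the nearest ancestor
    that has enough. The root is always available as a last resort.
    """
    if out is None:
        out = {}
    if fallback is None or node.frames >= min_frames:
        fallback = node
    if not node.children:
        out[node.name] = fallback.name
    for child in node.children:
        assign_transform_nodes(child, min_frames, fallback, out)
    return out


def adapt_mean(mean: List[float], A: List[List[float]], b: List[float]) -> List[float]:
    """Apply the MLLR mean update mu' = A mu + b to one Gaussian mean."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]


def build_demo_tree() -> Node:
    """Toy two-level tree with made-up frame counts on the leaves."""
    vowels = Node("vowels", children=[Node("a", 600), Node("i", 100)])
    consonants = Node("consonants", children=[Node("s", 50), Node("t", 80)])
    root = Node("root", children=[vowels, consonants])
    accumulate(root)
    return root
```

With the toy tree and `min_frames=500`, leaf `a` keeps its own transform, `i` shares the `vowels` transform, and `s` and `t` fall back to the global `root` transform. Swapping the frame-count test for a likelihood-based or variance-based test would give the other kinds of merging criteria the abstract lists.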
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT890435040
http://hdl.handle.net/11536/67319
Appears in Collections: Theses