Refining Segmental Boundaries by Cross-Database Boundary Model

Title:	跨語料庫之邊界模型對自動化切割的改善 (Refining Segmental Boundaries by Cross-Database Boundary Model)
Authors:	賴佳鴻 (Lai, Jia-Hong); 陳信宏 (Chen, Sin-Horng); Institute of Communications Engineering
Keywords:	automatic segmentation;forced alignment;boundary model;cross-database;speaker adaptation;lecture speech
Issue Date:	2017
Abstract:	This thesis proposes a two-stage automatic segmentation method. Existing databases are used to train a conventional GMM-HMM acoustic model and a GMM-based boundary model, which are then applied to automatic syllable segmentation of an entirely new target database. In the first stage, HMM-based forced alignment provides the initial syllable-level boundaries. In the second stage, the boundary model refines each of those boundaries within a local range. A small number of utterances selected from the target database serve as adaptation data for speaker adaptation of the boundary model, so that the statistics of the model parameters match those of the test data, which improves the refinement. In the experiments, lecture videos and captions from the National Chiao Tung University Open Course Website (NCTU OCW) were chosen as the source of the target database. The TCC300 training set was used to train the GMM-HMM baseline model, while the Fast broadcast read speech database (陶小姐朗讀式快速語料庫) and part of the OCW training set were used to train the boundary model, covering both background training and speaker adaptation. The goal is a highly automated syllable-level boundary labeling system that can be applied to new databases.
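The second stage described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `refine_boundaries`, the `score` callable (a stand-in for the GMM-based boundary model), and the default search radius `window` are all assumptions made for illustration. The sketch takes the boundaries produced by forced alignment and moves each one to the best-scoring frame within a local range, without letting neighbouring boundaries cross.

```python
from typing import Callable, List, Sequence


def refine_boundaries(
    initial: Sequence[int],
    score: Callable[[int], float],
    num_frames: int,
    window: int = 3,
) -> List[int]:
    """Move each first-stage boundary to the best-scoring nearby frame.

    initial    : boundary frame indices from forced alignment; assumed to be
                 strictly increasing and inside [0, num_frames - 1]
    score      : hypothetical stand-in for the boundary model; returns a
                 score (e.g. a log-likelihood) that a boundary lies at a frame
    num_frames : total number of frames in the utterance
    window     : search radius in frames (an assumed value, not from the thesis)
    """
    refined: List[int] = []
    for i, b in enumerate(initial):
        # Local search range around the first-stage boundary.
        lo = max(0, b - window)
        hi = min(num_frames - 1, b + window)
        # Keep the order: stay after the previous refined boundary
        # and before the next first-stage boundary.
        if refined:
            lo = max(lo, refined[-1] + 1)
        if i + 1 < len(initial):
            hi = min(hi, initial[i + 1] - 1)
        # Pick the frame in [lo, hi] with the highest boundary score.
        refined.append(max(range(lo, hi + 1), key=score))
    return refined
```

In practice the `score` callable would wrap whatever the adapted boundary model returns for the acoustic features around a candidate frame; here it can be any function from a frame index to a number, such as a lookup into a precomputed list of per-frame scores.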
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070360215 http://hdl.handle.net/11536/140327
Appears in Collections:	Thesis