A Study of Mobile Interactive Voice Response System (行動語音人機介面之研究)

Title:	行動語音人機介面之研究 (A Study of Mobile Interactive Voice Response System)
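Authors:	何依信; 張文輝; 電機學院通訊與網路科技產業專班 (Industry Program in Communication and Network Technology, College of Electrical Engineering)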
Keywords:	playout buffer; MANET (mobile ad hoc network); perceptual optimization; secant method
Issue date:	2006
Abstract:	User-friendly portable information services are a future trend, and the key to them is a voice-controlled human-machine interface. The purpose of this research is to develop an interactive voice response system that allows drivers to use voice commands to access an information server through the Internet. We first implement a distributed speech recognition system, in which speech features extracted by a local front-end are transmitted through a data channel to a remote back-end recognition server; specific audio information is then returned to the user according to the recognition result. The most important issue in networked voice communication is quality-of-service management, in particular packet loss, transmission delay, and delay jitter. To compensate for packet loss, we adopt a multiple description coding scheme that sends speech packets over two independent network channels. Delay jitter is usually handled by a playout buffer at the receiver, which temporarily stores speech packets and adaptively adjusts the playout time of each one so that the speech can be reconstructed in time. Because network delay varies during a session, there is a trade-off between the late-loss rate of speech packets and their buffering delay. We formulate the adaptive playout scheduling of multiple voice streams as a constrained optimization problem that leads to a better balance between end-to-end delay and packet loss. Under the multiple description coding scheme, and based on an objective speech quality prediction model, we optimize the playout delay of each individual packet for perceived quality. We also propose a perceptually motivated optimization criterion and a practically feasible algorithm for the playout buffer design.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT009492502 http://hdl.handle.net/11536/37928
Appears in collections:	Theses
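
The trade-off described in the abstract, between buffering delay and late packet loss, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the sample delays, and especially the linear cost (playout delay plus a fixed penalty times the late-loss rate), which merely stands in for the objective speech quality model the thesis refers to. The sketch also uses a plain search over candidate delays, not the secant method listed in the keywords.

```python
def late_loss_rate(delays_ms, playout_delay_ms):
    """Fraction of packets whose network delay exceeds the playout delay.

    Such packets arrive after their scheduled playout time and are
    treated as lost.
    """
    late = sum(1 for d in delays_ms if d > playout_delay_ms)
    return late / len(delays_ms)


def choose_playout_delay(delays_ms, candidates_ms, loss_penalty_ms=500.0):
    """Return the candidate that minimizes a simple delay/loss cost.

    The cost, delay + loss_penalty_ms * late-loss rate, is a hypothetical
    placeholder and not the perceptual criterion used in the thesis.
    """
    def cost(d):
        return d + loss_penalty_ms * late_loss_rate(delays_ms, d)

    return min(candidates_ms, key=cost)


# Hypothetical one-way network delays (ms) observed for ten packets.
delays = [40, 42, 45, 50, 55, 60, 80, 120, 200, 45]

best = choose_playout_delay(delays, candidates_ms=range(40, 201, 10))
print(best)  # 120
```

With these sample values, a playout delay of 120 ms drops only the one packet that took 200 ms, whereas waiting 200 ms would avoid that loss at the price of 80 ms of extra delay for every packet. A larger `loss_penalty_ms` shifts the choice toward longer buffering.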

Files in this item:

250201.pdf
