Title: 馮米塞斯函數類神經網路及其於頭部相關位置脈衝響應之模型建立
A Von Mises Basis Function Network for Head-Related-Impulse Response Modeling
Author: 方柏凱
Bo-Kai Fang
林進燈
Chin-Teng Lin
Institute of Electrical and Control Engineering
Keywords: head-related transfer function; neural network; Von Mises function; surround sound; sound synthesis; principal component analysis (PCA); HRIR; head-related impulse response
Issue Date: 2002
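Abstract: In this thesis we build a system model for 3D surround sound. We take the head-related impulse response (HRIR) database measured at the Massachusetts Institute of Technology, analyze and compress it, and build a system that can generate a sound source from any direction, using principal component analysis (PCA) and a neural network. Because HRIRs from neighboring directions differ only slightly, the whole HRIR database can be treated as a set of correlated random variables. PCA applies a linear transformation that converts the database into a new set of uncorrelated random variables, so that the variance of the original database is concentrated in a few of the new variables. To simplify the large and complex set of HRIRs, we therefore keep the few new variables with large variance and discard those with small variance. The weights of the linear combination that represents each HRIR are called SCFs (Spatial Characteristic Functions). To obtain the SCFs at unsampled positions, we train a neural network on the SCFs at the sampled positions and use it to interpolate. Because the spatial distribution of the SCFs resembles a Von Mises function, we adopt the radial basis function network (RBFN) architecture and replace the Gaussian basis functions with Von Mises functions to model the SCFs as functions of azimuth and elevation. After comparing several learning rules, we found that the orthogonal least squares (OLS) learning algorithm needs the fewest Von Mises functions as hidden-layer nodes and yields a smaller root mean square error (RMSE) than the other learning rules. This reduces the amount of data to be stored, and the same network can be used to smooth and interpolate the whole HRIR sound field. Finally, following various masking theories from psychoacoustics, we convolve the sound source with the modeled left-ear and right-ear HRIRs to produce 3D surround sound.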
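The PCA compression step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the data are a synthetic stand-in for the HRIR database, and the names `pca_compress` and `pca_reconstruct`, the array sizes, and the number of retained components are all assumptions made for illustration.

```python
import numpy as np

def pca_compress(hrirs, n_components):
    """Project HRIRs (one per row) onto their leading principal components.

    Returns the mean response, the orthonormal basis (one row per component),
    and the per-direction weights, which play the role of the SCFs.
    """
    mean = hrirs.mean(axis=0)
    centered = hrirs - mean
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    weights = centered @ basis.T
    return mean, basis, weights

def pca_reconstruct(mean, basis, weights):
    """Rebuild approximate HRIRs from the retained components."""
    return mean + weights @ basis

# Synthetic stand-in (assumption): 72 directions, 128 taps, built from
# 5 shared shapes plus a little noise so that neighboring rows are correlated.
rng = np.random.default_rng(0)
latent = rng.standard_normal((5, 128))
mix = rng.standard_normal((72, 5))
hrirs = mix @ latent + 0.01 * rng.standard_normal((72, 128))

mean, basis, weights = pca_compress(hrirs, n_components=5)
approx = pca_reconstruct(mean, basis, weights)
rel_err = np.linalg.norm(hrirs - approx) / np.linalg.norm(hrirs)
print(f"stored weights per direction: {weights.shape[1]}, relative error: {rel_err:.4f}")
```

Only the mean, the basis, and the low-dimensional weights need to be stored, which is where the reduction in data size comes from.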
In this thesis, we build a virtual 3D sound environment. More specifically, the thesis deals with the synthesis of 3D moving sound delivered binaurally through headphones. We use the MIT head-related impulse responses (HRIRs) as our database and propose an efficient method that reduces the amount of data and interpolates HRIRs at unsampled positions while retaining high localization resolution. First, the HRIRs are expressed as weighted combinations of a set of eigentransfer functions. The weights applied to each eigentransfer function are functions only of spatial location and are thus termed SCFs (Spatial Characteristic Functions). The SCFs that we extract, however, are restricted to the azimuths and elevations recorded in the HRIR database; the SCFs at unsampled spatial locations are unknown. We therefore use the architecture of a radial basis function network (RBFN) with Von Mises functions as activation functions to approximate the SCFs. This neural network is called the VMBFN (Von Mises Basis Function Network), and it addresses both the approximation and the interpolation problem. Among the learning rules compared, training the VMBFN with the orthogonal least squares learning algorithm gives the lowest RMSE (Root Mean Square Error). By convolving the source sound with the modeled HRIRs, we can synthesize spatial sound over headphones.
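The idea of interpolating SCFs with Von Mises basis functions can be sketched as follows. This is an illustration under stated assumptions, not the thesis method: the basis form `exp(kappa * (cos(angle) - 1))`, the fixed grid of centers, the value of `kappa`, and the smooth `toy_scf` target are all hypothetical, and the output weights are fitted by ordinary least squares on all centers rather than by the orthogonal least squares selection procedure named in the abstract.

```python
import numpy as np

def unit_vectors(az_deg, el_deg):
    """Convert azimuth/elevation in degrees to unit vectors on the sphere."""
    az = np.radians(np.asarray(az_deg, dtype=float))
    el = np.radians(np.asarray(el_deg, dtype=float))
    return np.stack(
        [np.cos(el) * np.cos(az), np.cos(el) * np.sin(az), np.sin(el)], axis=-1
    )

def von_mises_design(az, el, c_az, c_el, kappa):
    """Hidden-layer outputs: one Von Mises bump per center, peak value 1."""
    cos_angle = unit_vectors(az, el) @ unit_vectors(c_az, c_el).T
    return np.exp(kappa * (cos_angle - 1.0))

def toy_scf(az_deg, el_deg):
    """Smooth synthetic stand-in for one SCF (assumption, not measured data)."""
    az, el = np.radians(az_deg), np.radians(el_deg)
    return np.sin(az) * np.cos(el) + 0.5 * np.sin(el)

# Sampled directions (10 degree grid) and a coarser grid of basis centers.
az, el = [g.ravel() for g in np.meshgrid(np.arange(0, 360, 10.0), np.arange(-40, 81, 10.0))]
c_az, c_el = [g.ravel() for g in np.meshgrid(np.arange(0, 360, 30.0), np.arange(-30, 61, 30.0))]
kappa = 4.0  # concentration parameter, chosen arbitrarily here

target = toy_scf(az, el)
phi = von_mises_design(az, el, c_az, c_el, kappa)
# Plain least squares over all centers; OLS would instead select a subset.
weights, *_ = np.linalg.lstsq(phi, target, rcond=None)
rmse = np.sqrt(np.mean((phi @ weights - target) ** 2))

# Evaluate the fitted network at a direction that was not sampled.
query = von_mises_design(np.array([15.0]), np.array([5.0]), c_az, c_el, kappa)
predicted = (query @ weights)[0]
print(f"training RMSE: {rmse:.4f}, SCF estimate at (15, 5): {predicted:.4f}")
```

Measuring distance by the angle between directions, rather than by a Euclidean distance in azimuth and elevation, keeps the basis functions continuous across the 0/360 degree azimuth wrap, which is one reason a spherical basis suits this kind of data.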
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT910591016
http://hdl.handle.net/11536/70999
Appears in Collections: Thesis