Title: 基於雙工音高感知模型之神經網路旋律抽取演算法
A neural network based on duplex model of pitch perception for melody extraction
Authors: 周歆
冀泰石
Chou Hsin
Che, Tai-Shih
Master Program of Sound and Music Innovative Technologies, College of Engineering
Keywords: melody extraction; convolutional neural network; pitch perception model
Issue Date: 2017
Abstract: This thesis proposes a melody extraction method for polyphonic music that uses neural networks (NNs) modeled on human pitch perception. Classical psychoacoustic theory describes pitch perception with two models, the spectral model and the temporal model, which apply depending on whether or not the harmonics are resolved by human hearing. We first implement each model as an NN and evaluate its performance on melody extraction. We then compare the training results of each NN with the predictions of pitch perception theory, and use those results to construct auditory templates for the spectral model. Because the spectral model cannot resolve high-frequency harmonics, we finally combine the two NNs into a composite NN for the duplex model, in which the temporal model covers the harmonics that the spectral model leaves unresolved. Experimental results show that supplementing the spectral model with the temporal model in the unresolved frequency bands improves both melody extraction and pitch discrimination, and that the proposed composite NN is more effective in melody extraction than other conventional methods. The results also demonstrate that NNs built on psychoacoustic principles can be applied to music information retrieval tasks.
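The two pitch cues named in the abstract can be illustrated with classical signal-processing estimators. The sketch below is not the thesis's neural network method: it assumes a harmonic-sum search over the magnitude spectrum as a stand-in for the spectral cue and an autocorrelation peak search as a stand-in for the temporal cue. All function names, the 16 kHz sample rate, the frame length, and the 80 to 800 Hz search range are hypothetical choices made for illustration only.

```python
import numpy as np


def harmonic_tone(f0, harmonics, sr=16000, n=4096):
    """Synthesize a test tone from the given harmonic numbers (illustrative helper)."""
    t = np.arange(n) / sr
    return sum(np.sin(2 * np.pi * h * f0 * t) for h in harmonics)


def spectral_pitch(x, sr, fmin=80.0, fmax=800.0, n_harmonics=5):
    """Spectral cue (assumed stand-in): harmonic-sum search over the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    hz_per_bin = sr / len(x)
    candidates = np.arange(fmin, fmax, 1.0)  # f0 candidates in 1 Hz steps
    scores = []
    for f0 in candidates:
        # Sum the magnitude at the bins nearest to the first few harmonics of f0.
        bins = np.round(np.arange(1, n_harmonics + 1) * f0 / hz_per_bin).astype(int)
        bins = bins[bins < len(spectrum)]
        scores.append(spectrum[bins].sum())
    return candidates[int(np.argmax(scores))]


def temporal_pitch(x, sr, fmin=80.0, fmax=800.0):
    """Temporal cue (assumed stand-in): strongest autocorrelation peak in the lag range."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags only
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sr / lag


if __name__ == "__main__":
    sr = 16000
    low = harmonic_tone(220.0, range(1, 6), sr)     # low-order harmonics present
    high = harmonic_tone(220.0, range(10, 15), sr)  # only high-order harmonics
    print("spectral estimate, low harmonics:", spectral_pitch(low, sr))
    print("temporal estimate, high harmonics only:", temporal_pitch(high, sr))
```

In this toy setting the waveform containing only high-order harmonics is still periodic at the fundamental, which is why a periodicity-based cue can recover a pitch that a template over low-order harmonics does not see. The thesis learns both cues with NNs and combines them, which this sketch does not attempt.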
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070451902
http://hdl.handle.net/11536/142518
Appears in Collections: Thesis