Title: 基於聽覺感知使用頻率調變模板之和弦辨識
Auditory Perception Based Spectral Modulation Templates for Chord Recognition
Authors: 蔡勖正
冀泰石
Chi, Tai-Shih
Master Program of Sound and Music Innovative Technologies, College of Engineering
Keywords: auditory perception; chord recognition; inversion; spectral modulation
Issue Date: 2012
Abstract: This thesis proposes a template-based chord recognition system that also identifies chord inversions. Built on the phenomenon of auditory perception, the system combines chroma vectors with a newly proposed music feature in the spectral modulation domain. It operates in two stages: a preliminary classification of root and chord type, followed by an advanced identification of inversions. In the preliminary stage, the music signal is converted into a sequence of 12-dimensional chroma vectors (a chromagram) that represents the pitch content over time. Each chroma vector is compared with 36 predefined chord templates by Euclidean distance, which classifies the chord as a major, minor, or diminished triad and determines its root at the same time. In the advanced stage, because chords with different interval combinations produce different patterns in the spectral modulation domain, the interval pattern of each root-position or inverted chord is extracted by deconvolution. The pattern is dynamically stretched or compressed, and its Euclidean distance to 5 predefined inversion templates is computed to identify the inversion. In the computer simulations, chords are recognized for each of 11 instruments individually and for mixtures of them; to approximate real conditions, chord tones are also mixed with random amplitude ratios and percussion instruments are added. In addition, a listening test was conducted with four graduate students from an institute of music and four amateur music enthusiasts. The experimental results show that the proposed framework is highly consistent with human auditory perception.
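The preliminary stage described in the abstract (matching a 12-dimensional chroma vector against 36 chord templates by Euclidean distance) can be illustrated with a minimal sketch. This is not the thesis's implementation: the binary template shape, the unit-length normalization, and the names `build_templates` and `classify_chroma` are assumptions made for illustration. The only details taken from the abstract are the 12 chroma bins, the three triad types, and the Euclidean distance.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets above the root for the three triad types named in the abstract.
TRIAD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "dim": (0, 3, 6),
}


def build_templates():
    """Return 36 binary templates: 12 roots x 3 triad types (assumed template shape)."""
    templates = {}
    for root in range(12):
        for chord_type, intervals in TRIAD_INTERVALS.items():
            vec = [0.0] * 12
            for step in intervals:
                vec[(root + step) % 12] = 1.0
            templates[(root, chord_type)] = vec
    return templates


def normalize(vec):
    """Scale a vector to unit length (assumption: removes loudness differences)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def classify_chroma(chroma, templates):
    """Return (root name, chord type, distance) of the nearest template."""
    c = normalize(chroma)
    best_key, best_dist = None, math.inf
    for key, template in templates.items():
        t = normalize(template)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(c, t)))
        if dist < best_dist:
            best_key, best_dist = key, dist
    root, chord_type = best_key
    return NOTE_NAMES[root], chord_type, best_dist


if __name__ == "__main__":
    templates = build_templates()
    # Hypothetical chroma frame with most energy on C, E and G.
    frame = [1.0, 0.1, 0.1, 0.1, 0.8, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1]
    print(classify_chroma(frame, templates))
```

Because a chroma vector folds all octaves into 12 pitch classes, a triad and its inversions give the same pitch-class set. A matcher of this kind therefore cannot tell them apart, which is consistent with the abstract assigning inversion identification to a separate second stage.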
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079902501
http://hdl.handle.net/11536/48964
Appears in Collections: Thesis