Title: 基於類神經網路及樂理知識限制之印刷樂譜辨識系統
A Printed Music Recognition System Based on Neural Network and Musical Knowledge Limitation
Authors: 廖威凱
Wei-Kai Liao
林昇甫
Sheng-Fuu Lin
Institute of Electrical and Control Engineering
Keywords: printed music recognition; neural network; musical knowledge
Issue Date: 2006
Abstract: Optical music recognition greatly simplifies the digitization of music data: users need no performance skills, and the time cost and high error rate of manual transcription are avoided. To address the difficulties of recognizing music symbols in a score, this thesis proposes a printed music recognition system based on a neural network and musical knowledge limitation that handles the recognition task systematically. The system divides the music symbols in a score into three kinds: (1) symbols with a note head, (2) symbols without a note head, and (3) other symbols. Symbols with a note head vary the most in shape, and their recognition applies two rules derived from musical knowledge limitation. First, the note-head template is resized according to the stave spacing, and feature matching is performed by fast normalized cross-correlation. The position of the stem is then found by vertical run-length encoding. From the relative positions of the stem and the note head, the exact position of the flag or beam is obtained and used to analyze the note duration. For symbols without a note head, suitable geometric features are extracted and classified by a pre-trained neural network. Among the other symbols, time signatures are recognized with techniques from optical character recognition: each extracted digit is linearly normalized to a uniform pixel size and thinned, and the resulting image is treated as a high-dimensional feature vector. A modified nearest neighbor classification mitigates the drop in recognition rate that the original nearest neighbor classification suffers in a high-dimensional feature space. Experimental results show that the recognition rate of the proposed system varies little across the different experimental conditions, indicating that the proposed method is reasonably robust.
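The stem-location step named in the abstract, vertical run-length encoding, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the binary-image representation (a list of rows, 1 for a black pixel), and the `min_ratio` threshold relating stem length to stave spacing are all assumptions made for illustration.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start_row, length) of one vertical run of black pixels


def vertical_runs(image: List[List[int]], col: int) -> List[Run]:
    """Run-length encode the black pixels (value 1) in a single column."""
    runs: List[Run] = []
    start = None  # row where the current black run began, if any
    for row, line in enumerate(image):
        if line[col] == 1 and start is None:
            start = row
        elif line[col] != 1 and start is not None:
            runs.append((start, row - start))
            start = None
    if start is not None:  # run reaches the bottom edge of the image
        runs.append((start, len(image) - start))
    return runs


def stem_candidates(
    image: List[List[int]], stave_space: float, min_ratio: float = 2.5
) -> List[Tuple[int, int, int]]:
    """Return (col, start_row, length) for vertical runs long enough to be a stem.

    min_ratio is an assumed threshold (run length >= min_ratio * stave_space),
    not a value taken from the thesis.
    """
    min_len = min_ratio * stave_space
    width = len(image[0]) if image else 0
    found = []
    for col in range(width):
        for start, length in vertical_runs(image, col):
            if length >= min_len:
                found.append((col, start, length))
    return found
```

Scaling the length threshold by the stave spacing follows the same idea the abstract applies to note-head templates: sizes in a score are measured relative to the staff, so the threshold adapts to the scan resolution.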
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009312572
http://hdl.handle.net/11536/78259
Appears in Collections: Thesis