Full metadata record
DC Field | Value | Language
dc.contributor.author | 陳瑞正 | en_US
dc.contributor.author | Jui-Cheng Chen | en_US
dc.contributor.author | 林進燈 | en_US
dc.contributor.author | Chin-Teng Lin | en_US
dc.date.accessioned | 2014-12-12T02:27:39Z | -
dc.date.available | 2014-12-12T02:27:39Z | -
dc.date.issued | 2004 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT009212550 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/68457 | -
dc.description.abstract | In this thesis, we propose a system for the classification and segmentation of audio signals. The system classifies and segments files containing silence, pure speech, pure music, and song according to their type. We analyzed and compared the features of these kinds of audio and, based on the results, designed a classification flow that classifies and segments the input audio in two sequential stages. Silence detection first marks the silent portions of the audio according to a threshold. Stage 1 then classifies the non-silent portions into pure speech and "with music components", and stage 2 further classifies the portions labeled "with music components" in stage 1 into pure music and song. To overcome the poor performance of traditional features in distinguishing pure music from song, this thesis proposes a new feature called the frequency variation of the top-three peaks (FVTP). The feature captures the property that the spectral structure of a song changes markedly over time, whereas that of pure music changes relatively little, and it therefore improves the performance of pure music/song classification. As the core classifier, the system adopts a feedforward self-constructing neural fuzzy inference network (SONFIN), which can construct and adjust its own structure, learn its parameters, and carry out an excellent fuzzy neural inference process; we exploit these properties to obtain better classification results. Experimental results show that the system achieves an average classification accuracy of more than 90%. The system can therefore serve as a front end for applications such as speech recognition and speaker identification, ensuring that the content fed into these applications meets their requirements and thereby improving their performance. | zh_TW
dc.description.abstract | In this thesis, we proposed an audio classification and segmentation system. The system classifies and segments audio files that contain silence, pure speech, pure music, and song according to their content. We analyzed and compared features of audio signals and designed a two-stage classification flow that classifies and segments input audio signals sequentially. The flow starts with silence detection, which marks silent portions according to a threshold. Stage 1 then classifies the non-silent portions into pure speech and "with music components", and stage 2 further classifies the "with music components" portions from stage 1 into pure music and song. To address the poor performance of traditional features in pure music/song classification, we proposed a novel feature named FVTP (the frequency variation of the top-three peaks). The feature captures the property that the spectral structure varies more over time for song than for pure music, and it therefore improves the performance of pure music/song classification. As the main classifier, the system adopts an on-line self-constructing neural fuzzy inference network (SONFIN), which finds its optimal structure and parameters automatically and provides a superior inference process; we exploited these properties to achieve better classification results. Experimental results showed that the system achieves an average accuracy rate of more than 90%. The proposed system can therefore serve as a front end for applications such as speech recognition and speaker identification and improve the performance of these application systems. | en_US
dc.language.iso | en_US | en_US
dc.subject | 音頻信號分類 (audio signal classification) | zh_TW
dc.subject | 過零率 (zero-crossing rate) | zh_TW
dc.subject | 特徵抽取 (feature extraction) | zh_TW
dc.subject | 類神經網路 (neural networks) | zh_TW
dc.subject | 音頻信號分析 (audio signal analysis) | zh_TW
dc.subject | audio signal classification | en_US
dc.subject | zero-crossing rate | en_US
dc.subject | feature extraction | en_US
dc.subject | neural networks | en_US
dc.subject | audio signal analysis | en_US
dc.title | 利用模糊類神經網路之音頻信號分類與切割技術 (Audio Classification and Segmentation Technique Using Fuzzy Neural Networks) | zh_TW
dc.title | Audio Classification and Segmentation Technique Using Fuzzy Neural Networks | en_US
dc.type | Thesis | en_US
dc.contributor.department | 電控工程研究所 (Institute of Electrical and Control Engineering) | zh_TW
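
The abstract above describes a threshold-based silence detector and a novel FVTP feature (the frequency variation of the top-three spectral peaks) whose values tend to be larger for song than for pure music. The thesis text itself is not part of this record, so the Python snippet below is only a rough, illustrative sketch of those two ideas: the frame length, hop size, Hann window, energy threshold, peak-picking rule, and the function names (is_silence, fvtp_like_feature) are all assumptions made for illustration and are not taken from the thesis. The SONFIN classifier stage is not sketched.

```python
import numpy as np


def top3_peak_freqs(frame, sr):
    """Frequencies (Hz) of the three strongest bins in the frame's magnitude spectrum."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    top3 = np.argsort(spectrum)[-3:]  # indices of the 3 largest magnitudes
    return np.sort(freqs[top3])


def fvtp_like_feature(signal, sr, frame_len=1024, hop=512):
    """Mean frame-to-frame change (Hz) of the top-three peak frequencies.

    Per the abstract, this kind of spectral variation tends to be larger for
    song than for pure music; the exact FVTP definition in the thesis may differ.
    """
    peaks = [top3_peak_freqs(signal[s:s + frame_len], sr)
             for s in range(0, len(signal) - frame_len + 1, hop)]
    peaks = np.asarray(peaks)
    if len(peaks) < 2:
        return 0.0
    return float(np.mean(np.abs(np.diff(peaks, axis=0))))


def is_silence(frame, energy_threshold=1e-3):
    """Threshold-based silence check on short-time energy (assumed rule)."""
    return float(np.mean(np.asarray(frame, dtype=float) ** 2)) < energy_threshold
```

In such a sketch, a larger fvtp_like_feature value over a non-silent segment would hint at song rather than pure music, mirroring the property the abstract describes; in the thesis, features of this kind are fed into the SONFIN classifier rather than thresholded directly.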
Appears in Collections: Thesis


Files in This Item:

  1. 255001.pdf
