Audio Classification and Segmentation Technique Using Fuzzy Neural Networks (利用模糊類神經網路之音頻信號分類與切割技術)

Full metadata record

DC Field	Value	Language
dc.contributor.author	陳瑞正	en_US
dc.contributor.author	Jui-Cheng Chen	en_US
dc.contributor.author	林進燈	en_US
dc.contributor.author	Chin-Teng Lin	en_US
dc.date.accessioned	2014-12-12T02:27:39Z	-
dc.date.available	2014-12-12T02:27:39Z	-
dc.date.issued	2004	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT009212550	en_US
dc.identifier.uri	http://hdl.handle.net/11536/68457	-
dc.description.abstract	在本論文中我們提出了一個針對音頻信號之分類與切割的系統，此系統可將含有靜音、純語音、純音樂以及歌曲之檔案，根據其類型加以分類與切割。我們針對上述各種音訊的特徵的作分析與比較，並根據這些分析與比較的結果，設計一套分類流程將輸入的音訊分兩階段依序完成分類與切割。一開始的靜音偵測根據一個門檻值標示出音訊中屬於靜音的部分。之後，第一階段將輸入音訊中非靜音部分分為純語音與「含有音樂成分」兩類，第二階段將在第一階段中被歸類為「含有音樂成分」的部分，進一步分為純音樂以及歌曲。為了解決傳統特徵在進行純音樂與歌曲分類時分類效果不佳的問題，本論文提出了一個名為「前三峰值之頻率變化量(FVTP)」的新特徵。此特徵描述了歌曲的頻譜結構會隨著時間而顯著地改變而純音樂之頻譜結構改變量相對較小之特性。因此該特徵能在進行純音樂與歌曲分類時，改善分類效果不佳的問題。而在分類器的選用方面，本系統採用一前向式自我建構類神經模糊推理網路(SONFIN)做為核心分類器。該網路具有可自我建構並調整的架構與參數學習的功能，以及優異的模糊類神經推論過程。我們利用這些特性達到較佳之分類結果。實驗結果顯示，本系統可達到平均90%以上的分類正確率。因此，本系統可作為許多如語音辨識、語者辨識等應用系統的前端處理，使輸入這些應用系統的內容符合系統要求以提升應用系統的效能。	zh_TW
dc.description.abstract	This thesis proposes an audio classification and segmentation system that classifies and segments audio files containing silence, pure speech, pure music, and songs according to their content. We analyze and compare the features of these audio types and, based on the results, design a two-stage classification flow that classifies and segments the input audio sequentially. The flow starts with silence detection, which marks the silent parts of the signal according to a threshold. Stage 1 then classifies the non-silent parts into pure speech and “with music components”. Stage 2 further classifies the parts labeled “with music components” in stage 1 into pure music and songs. Because traditional features perform poorly on pure music/song classification, we propose a novel feature, the frequency variation of the top three peaks (FVTP). The feature captures the property that the spectral structure of a song changes markedly over time, whereas that of pure music changes relatively little, and it therefore improves pure music/song classification. An on-line self-constructing neural fuzzy inference network (SONFIN) is adopted as the core classifier. SONFIN constructs and adjusts its own structure, learns its parameters automatically, and has a superior neural fuzzy inference process; we exploit these properties to achieve better classification results. Experimental results show that the system achieves an average classification accuracy of more than 90%. The proposed system can therefore serve as a front end for applications such as speech recognition and speaker identification, ensuring that the input to these applications meets their requirements and thereby improving their performance.	en_US
dc.language.iso	en_US	en_US
dc.subject	音頻信號分類	zh_TW
dc.subject	過零率	zh_TW
dc.subject	特徵抽取	zh_TW
dc.subject	類神經網路	zh_TW
dc.subject	音頻信號分析	zh_TW
dc.subject	audio signal classification	en_US
dc.subject	zero-crossing rate	en_US
dc.subject	feature extraction	en_US
dc.subject	neural networks	en_US
dc.subject	audio signal analysis	en_US
dc.title	利用模糊類神經網路之音頻信號分類與切割技術	zh_TW
dc.title	Audio Classification and Segmentation Technique Using Fuzzy Neural Networks	en_US
dc.type	Thesis	en_US
dc.contributor.department	電控工程研究所 (Institute of Electrical and Control Engineering)	zh_TW
Appears in Collections:	Thesis

Files in This Item:

255001.pdf
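
Illustrative sketch (not taken from the thesis): the abstract describes threshold-based silence detection, lists zero-crossing rate among its subjects, and introduces a feature based on the frequency variation of the top three spectral peaks. The Python below shows one plausible way such quantities could be computed per frame. Every function name, the energy threshold, the frame length, and the exact definition of the peak-variation measure are assumptions made for illustration; the thesis's own definitions of FVTP and its SONFIN classifier are not given in this record and are not reproduced here.

```python
import cmath
import math


def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (tail dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_silence(frame, energy_threshold=1e-4):
    """Threshold-based silence flag; the default threshold is an assumed value."""
    return short_time_energy(frame) < energy_threshold


def top_peak_bins(frame, n_peaks=3):
    """Indices of the n_peaks largest-magnitude DFT bins (naive O(N^2) DFT).

    Assumption: "peaks" are approximated by the largest bins, not by true
    local maxima of the spectrum.
    """
    n = len(frame)
    mags = []
    for k in range(1, n // 2):  # skip DC, keep positive frequencies only
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append((abs(acc), k))
    return sorted(k for _, k in sorted(mags, reverse=True)[:n_peaks])


def peak_frequency_variation(frames, n_peaks=3):
    """Mean absolute bin shift of the top peaks between consecutive frames.

    Assumed stand-in for an FVTP-like measure, expressed in DFT bins.
    """
    peaks = [top_peak_bins(f, n_peaks) for f in frames]
    shifts = [sum(abs(a - b) for a, b in zip(p, q)) / n_peaks
              for p, q in zip(peaks, peaks[1:])]
    return sum(shifts) / len(shifts) if shifts else 0.0


def synth_frame(bins, n=32):
    """Hypothetical test signal: three sinusoids centred on the given bins."""
    amps = [1.0, 0.8, 0.6]
    return [sum(a * math.sin(2 * math.pi * k * t / n) for a, k in zip(amps, bins))
            for t in range(n)]


if __name__ == "__main__":
    steady = [synth_frame([3, 7, 11]) for _ in range(3)]
    moving = [synth_frame([3, 7, 11]), synth_frame([4, 9, 13])]
    print("steady spectrum variation:", peak_frequency_variation(steady))
    print("moving spectrum variation:", peak_frequency_variation(moving))
```

Under these assumptions, a signal whose dominant peaks stay put yields a variation near zero, while one whose peaks move between frames yields a larger value, which mirrors the pure music versus song contrast the abstract describes. In a full system these per-frame values would be fed to a classifier; that step is omitted because the record does not specify it.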
