CNN BASED TWO-STAGE MULTI-RESOLUTION END-TO-END MODEL FOR SINGING MELODY EXTRACTION

Title:	CNN BASED TWO-STAGE MULTI-RESOLUTION END-TO-END MODEL FOR SINGING MELODY EXTRACTION
Authors:	Chen, Ming-Tso; Li, Bo-Jun; Chi, Tai-Shih (Department of Electrical and Computer Engineering)
Keywords:	Melody extraction; multi-resolution; convolution neural network; end-to-end learning; music information retrieval
Issue date:	1-Jan-2019
Abstract:	Inspired by human hearing perception, we propose a two-stage multi-resolution end-to-end model for singing melody extraction in this paper. The convolutional neural network (CNN) is the core of the proposed model and generates the multi-resolution representations. 1-D and 2-D multi-resolution analyses are carried out successively on the waveform and on the resulting spectrogram-like graphs, using 1-D and 2-D CNN kernels of different lengths and sizes. The 1-D CNNs with kernels of different lengths produce multi-resolution spectrogram-like graphs without suffering from the trade-off between spectral and temporal resolutions. The 2-D CNNs with kernels of different sizes extract features from spectro-temporal envelopes of different scales. Experimental results show that the proposed model outperforms the three compared systems on three out of five public databases.
URI:	http://hdl.handle.net/11536/152923
ISBN:	978-1-4799-8131-1
ISSN:	1520-6149
Journal:	2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)
Start page:	1005
End page:	1009
Appears in collections:	Conference Papers
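
The two-stage idea described in the abstract can be illustrated with a small sketch. This is not the authors' model: the kernels here are fixed (Hann-windowed cosines in stage 1, box filters in stage 2) rather than learned, and every name and value (windowed_cosine, conv1d_same, stage1, stage2, the 8 kHz sample rate, the kernel lengths and sizes) is an assumption chosen only for illustration. It shows 1-D kernels of different lengths turning a waveform into several spectrogram-like maps on a shared frame grid, followed by 2-D kernels of different sizes applied to each map.

```python
import math

def windowed_cosine(length, freq_hz, sr):
    """Hann-windowed cosine: a fixed stand-in for one learned 1-D kernel."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * n / (length - 1)))
            * math.cos(2.0 * math.pi * freq_hz * n / sr)
            for n in range(length)]

def conv1d_same(wave, kernel, hop):
    """Strided 1-D correlation with zero padding, followed by rectification.

    Padding is centred so every kernel length yields the same frame count.
    """
    size = len(kernel)
    left = size // 2
    padded = [0.0] * left + list(wave) + [0.0] * (size - 1 - left)
    return [abs(sum(padded[t + i] * kernel[i] for i in range(size)))
            for t in range(0, len(wave), hop)]

def stage1(wave, sr, kernel_lengths, freqs, hop):
    """Stage 1: one spectrogram-like map (channels x frames) per kernel length."""
    return {length: [conv1d_same(wave, windowed_cosine(length, f, sr), hop)
                     for f in freqs]
            for length in kernel_lengths}

def conv2d_same(image, kh, kw):
    """Zero-padded kh x kw box filter: a stand-in for one learned 2-D kernel."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(kh):
                for dc in range(kw):
                    rr, cc = r + dr - kh // 2, c + dc - kw // 2
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += image[rr][cc]
            out[r][c] = acc / (kh * kw)
    return out

def stage2(maps, kernel_sizes):
    """Stage 2: apply 2-D kernels of several sizes to every stage-1 map."""
    return {(length, size): conv2d_same(m, size[0], size[1])
            for length, m in maps.items()
            for size in kernel_sizes}

# Toy input (assumed values): 0.1 s of a 220 Hz sine sampled at 8 kHz.
SR = 8000
FREQS = [110.0, 220.0, 440.0]
wave = [math.sin(2.0 * math.pi * 220.0 * n / SR) for n in range(800)]

maps = stage1(wave, SR, kernel_lengths=(64, 256), freqs=FREQS, hop=80)
features = stage2(maps, kernel_sizes=((1, 3), (3, 3)))
```

Because the hop is shared and the padding is centred, the maps produced by the short and long kernels line up frame by frame, so they can be stacked as parallel inputs to the second stage. In this sketch the longer kernels separate nearby frequencies more sharply, while the shorter ones follow changes in time more closely; a trained model would learn both sets of kernels from data.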