Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chi, Taishih | en_US |
dc.contributor.author | Shamma, Shihab A. | en_US |
dc.date.accessioned | 2014-12-08T15:16:18Z | - |
dc.date.available | 2014-12-08T15:16:18Z | - |
dc.date.issued | 2006-07-01 | en_US |
dc.identifier.issn | 1558-7916 | en_US |
dc.identifier.uri | http://dx.doi.org/10.1109/TSA.2005.860828 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/12092 | - |
dc.description.abstract | We examine the encoding of acoustic spectra by parameters derived from singularities found in their multiscale auditory representations. The multiscale representation is a wavelet transform of an auditory version of the spectrum, formulated based on findings of perceptual experiments and physiological research in the auditory cortex. The multiscale representation of a spectral pattern usually contains well-defined singularities in its phase function that reflect prominent features of the underlying spectrum, such as its relative peak locations and amplitudes. Properties (locations and strengths) of these singularities are examined and employed to reconstruct the original spectrum using an iterative projection algorithm. Although the singularities form a nonconvex set, simulations demonstrate that a well-chosen initial pattern usually converges to a good approximation of the input spectrum. Perceptually intelligible speech can be resynthesized from the reconstructed auditory spectrograms, and hence these singularities can potentially serve as efficient features in speech compression. In addition, the singularities are highly robust to noise, which makes them useful features in applications such as vowel recognition and speaker identification. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | auditory model | en_US |
dc.subject | convex projection | en_US |
dc.subject | phase singularity | en_US |
dc.subject | spectrum restoration | en_US |
dc.title | Spectrum restoration from multiscale auditory phase singularities by generalized projections | en_US |
dc.type | Article | en_US |
dc.identifier.doi | 10.1109/TSA.2005.860828 | en_US |
dc.identifier.journal | IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | en_US |
dc.citation.volume | 14 | en_US |
dc.citation.issue | 4 | en_US |
dc.citation.spage | 1179 | en_US |
dc.citation.epage | 1192 | en_US |
dc.contributor.department | 電信工程研究所 | zh_TW |
dc.contributor.department | Institute of Communications Engineering | en_US |
dc.identifier.wosnumber | WOS:000238709200010 | - |
dc.citation.woscount | 0 | - |
Appears in Collections: | Journal Articles |
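
The abstract above describes reconstructing a spectrum from singularity-derived constraints with an iterative (generalized) projection algorithm. The sketch below is only a minimal, hypothetical illustration of that projection style, not the authors' method: it replaces the paper's nonconvex multiscale phase-singularity constraints with two simple convex constraints (known spectrum values at a few feature locations, plus a band-limit standing in for smoothness), so the iteration reduces to plain alternating projections. All names, sizes, and parameters here are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of reconstruction by alternating (generalized) projections.
# Hypothetical simplification: constraints are (a) known spectrum values at a
# few "feature" frequencies and (b) a band-limit (smoothness) enforced in the
# Fourier domain, rather than the paper's multiscale phase singularities.

rng = np.random.default_rng(0)

# Synthetic "auditory spectrum": a smooth positive curve with a few peaks.
n = 256
f = np.linspace(0, 1, n)
true_spec = (np.exp(-((f - 0.2) / 0.05) ** 2)
             + 0.7 * np.exp(-((f - 0.55) / 0.08) ** 2)
             + 0.4 * np.exp(-((f - 0.8) / 0.04) ** 2))

# Constraint set A: match the measured values at sampled feature locations
# (standing in for singularity locations and strengths).
feature_idx = np.sort(rng.choice(n, size=20, replace=False))
feature_val = true_spec[feature_idx]

# Constraint set B: the spectrum must be band-limited (smooth).
cutoff = 24  # keep only the lowest Fourier coefficients

def project_onto_features(x):
    """Reset the known feature samples to their measured values."""
    y = x.copy()
    y[feature_idx] = feature_val
    return y

def project_onto_bandlimit(x):
    """Zero out high-frequency Fourier coefficients."""
    X = np.fft.rfft(x)
    X[cutoff:] = 0
    return np.fft.irfft(X, n=n)

# Start from a flat initial guess and alternate the two projections.
x = np.full(n, true_spec.mean())
for _ in range(200):
    x = project_onto_bandlimit(project_onto_features(x))

err = np.linalg.norm(x - true_spec) / np.linalg.norm(true_spec)
print(f"relative reconstruction error: {err:.3f}")
```

In the paper's setting the constraint set derived from the phase singularities is nonconvex, so convergence depends on the starting point, which is why the abstract notes that a well-chosen initial pattern is needed for the iteration to reach a good approximation of the input spectrum.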