Title: Design of Psychoacoustic Model for MPEG-4 Advanced Audio Coding and MPEG Layer III
Author: 邱挺
Ting Chiou
刘启民
李文杰
Institute of Computer Science and Engineering
Keywords: psychoacoustic model; fishy noise; filterbank; harmonic-rich signals; energy floor
Issue Date: 2003
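Abstract: This thesis proposes an efficient psychoacoustic model that replaces the FFT computation with the filterbank, greatly reducing computation time, and it also proposes attack detection in the frequency domain. It further uses the energy floor to estimate the quantization error and proposes an efficient way to estimate the energy floor, which effectively reduces fishy noise and yields good compression quality. Finally, the psychoacoustic model is implemented in NCTU-AAC and NCTU-MP3, where it improves efficiency by more than 60% over the conventional psychoacoustic model. Quality is evaluated with ODG, and the proposed method achieves a gain of 0.2 on the MPEG12 bitstream.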
This thesis presents an efficient psychoacoustic model that provides better quality than psychoacoustic model II. The design is considered from two aspects. First, we improve the psychoacoustic model by varying the tonal and noise masking offsets across bands and by applying energy normalization, which suppresses the distortion known as fishy noise or birdie noise that is caused by overestimated masking in harmonic-rich signals. Second, we consider the design issues in implementing the psychoacoustic model on the filterbank used in MP3 and AAC, instead of on an independent FFT, to reduce computational complexity and storage. The efficient psychoacoustic model provides a 60 percent performance gain over psychoacoustic model II in MPEG-2/4 AAC and MP3. In quality comparisons based on the Objective Difference Grade (ODG) and a subjective test, the efficient psychoacoustic model provides a quality gain of 0.26 at 128 kbps and 0.3 at 112 kbps for the MPEG test bitstreams in NCTU-AAC.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009117516
http://hdl.handle.net/11536/49569
Appears in Collections: Thesis


Files in This Item:

  1. 751601.pdf
