Title: Design of Psychoacoustic Model for MPEG-4 Advanced Audio Coding and MPEG Layer III
Author: 邱挺
Ting Chiou
劉啟民
李文傑
Institute of Computer Science and Engineering
Keywords: psychoacoustic model; fishy noise; filterbank; harmonic-rich signals; energy floor
Issue Date: 2003
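Abstract: This thesis proposes an efficient psychoacoustic model that computes on the filterbank instead of an FFT, which greatly reduces computation time, and it introduces attack detection in the frequency domain. It also uses the energy floor to estimate the quantization error and proposes an efficient way to estimate the energy floor, which effectively reduces fishy noise and yields good compression quality. Finally, the psychoacoustic model is implemented in NCTU-AAC and NCTU-MP3, where it improves efficiency by more than 60% over the traditional psychoacoustic model. Quality is evaluated with ODG, and the proposed method gains 0.2 on the MPEG12 bitstream.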
This thesis presents an efficient psychoacoustic model that provides better quality than psychoacoustic model II. It approaches the design from two aspects. First, we vary the tonal and noise masking offsets from band to band and apply energy normalization to suppress the distortion, known as fishy noise or birdie noise, that overestimated masking causes in harmonic-rich signals. Second, we implement the psychoacoustic model on the filterbank already used in MP3 and AAC, rather than on an independent FFT, to reduce computational complexity and storage. The efficient psychoacoustic model provides a 60 percent performance gain over psychoacoustic model II in MPEG-2/4 AAC and MP3. In a quality comparison based on the Objective Difference Grade (ODG) and a subjective test, it provides a quality gain of 0.26 at a bit rate of 128k and 0.3 at 112k on the MPEG test bitstreams in NCTU-AAC.
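As a rough illustration of the two ideas in the abstract (working on the codec's own filterbank output, and letting the masking offset vary by band), here is a minimal Python sketch. It is not the thesis's algorithm: the function names, band edges, tonality values, and offsets in dB are all hypothetical placeholders chosen only to show the arithmetic, and the energy floor and attack detection are not modelled.

```python
def band_energies(coeffs, band_edges):
    """Sum squared filterbank coefficients inside each band [lo, hi)."""
    return [
        sum(c * c for c in coeffs[lo:hi])
        for lo, hi in zip(band_edges[:-1], band_edges[1:])
    ]


def masking_thresholds(energies, tonality, tone_offset_db, noise_offset_db):
    """Lower each band energy by a per-band offset blended by tonality.

    tonality is 1.0 for a purely tonal band and 0.0 for a noise-like band.
    The offsets are per-band lists, so they may differ from band to band.
    """
    thresholds = []
    for e, t, tone_db, noise_db in zip(
        energies, tonality, tone_offset_db, noise_offset_db
    ):
        offset_db = t * tone_db + (1.0 - t) * noise_db
        thresholds.append(e * 10.0 ** (-offset_db / 10.0))
    return thresholds


# Hypothetical frame: 8 filterbank coefficients split into 3 bands.
coeffs = [1.0, 2.0, 0.5, 0.5, 0.0, 1.0, 1.0, 1.0]
band_edges = [0, 2, 4, 8]

energies = band_energies(coeffs, band_edges)
thresholds = masking_thresholds(
    energies,
    tonality=[1.0, 0.0, 0.5],            # assumed values, not measured
    tone_offset_db=[20.0, 20.0, 20.0],   # assumed offsets in dB
    noise_offset_db=[10.0, 10.0, 10.0],  # assumed offsets in dB
)
```

Because the band energies come straight from the coefficients the encoder already computes, a sketch like this needs no separate FFT pass, which is the kind of saving the abstract refers to.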
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009117516
http://hdl.handle.net/11536/49569
Appears in Collections: Thesis


Files in This Item:

  1. 751601.pdf
