Title: A Noise-to-Masking-Ratio-Optimized Audio Bitrate Transcoding Technique
NMR Optimized Bitrate Transcoding for MPEG-2/4 AAC with LC Profile
Authors: 賴德宣 (Te Hsueh Lai)
蔣迪豪 (Tihao Chiang)
Institute of Electronics
Keywords: noise-to-masking ratio (NMR); bitrate adaptation; audio transcoder; auditory masking; masking threshold; rate-distortion optimization (RDO); MPEG AAC LC; fast transcoding
Publication Date: 2005
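Abstract: Multimedia applications such as music-on-demand and digital broadcasting are becoming increasingly widespread. Real-time audio streaming is widely used to deliver multimedia content over heterogeneous networks to diverse clients. To adapt to varying network conditions and to the equipment and capabilities of each client, audio streaming relies on bitrate conversion. This thesis proposes a fast, noise-to-masking-ratio-optimized bitrate transcoding technique for standard MPEG-2/4 AAC-LC bitstreams (Fast Noise-to-Masking Ratio optimized bitrate transcoding, FRDOT). For each target bitrate, FRDOT finds the best quantization parameter for every frequency band so as to optimize the noise-to-masking ratio (NMR) within that band. Moreover, because an audio bitstream has the same masking thresholds before and after bitrate transcoding, the optimization criterion can be carried over from the NMR to the noise-to-signal ratio (NSR). Based on the NSR, we use table lookup to reduce the total computation and speed up the optimized transcoding. To accelerate the transcoder further, FRDOT adopts a bandwidth limiter that eliminates the time spent on the iterative search for the optimal quantization parameters within a coding block. FRDOT also introduces a bitrate control module that brings the transcoder's output bitrate closer to the target bitrate. Experimental results show that the proposed transcoder can convert audio bitstreams from a high bitrate to a lower one with an NMR improvement of 0.5 to 3 dB over a cascaded transcoder, and that it runs five to eight times faster.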
Real-time audio streaming services such as music-on-demand (MOD) and digital audio broadcasting (DAB) deliver multimedia content over heterogeneous networks to client devices with varying capabilities. To fit the network conditions and the clients' capabilities, bitrate adaptation based on transcoding is applied. We present a noise-to-masking-ratio (NMR) optimized MPEG-2/4 AAC LC transcoder called the Fast Rate-Distortion Optimized Transcoder (FRDOT). FRDOT searches for the optimal scalefactor under the NMR criterion at a given bitrate. Because the masking thresholds of the input and output bitstreams are identical before and after transcoding, the NMR difference can be computed as a signal-to-noise-ratio (SNR) difference instead. Within FRDOT, the SNR value is further converted to a noise-to-signal ratio (NSR) to represent the distortion energy of the audio signal, so NMR-optimized transcoding reduces to NSR-optimized transcoding. NSR-optimized transcoding finds the optimal scalefactor increment from the magnitudes of the quantized input coefficients and the target bitrate. A table lookup technique speeds up the search for the optimal scalefactor increment. To reduce the execution time further, a bandwidth limiter is adopted to remove the iterative rate-distortion optimization of a frame. In addition, a bitrate control module is proposed to keep the average bitrate of the output bitstream close to the target bitrate. Experimental results show that the NMR of FRDOT is 0.5 to 3.0 dB better than that of a cascaded transcoder (CT) at different bitrates, and that FRDOT runs 5 to 8 times faster than CT on average.
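The table-lookup idea can be illustrated with a minimal sketch; this is not the thesis implementation. It assumes the standard AAC power-law quantizer, in which a decoded coefficient is x = q^(4/3) * 2^(sf/4). Under that assumption, raising the scalefactor by d steps maps a quantized magnitude q to floor(q * 2^(-3d/16) + offset), so both the new magnitude and the added noise relative to the input signal depend only on q and d. The function names, the table layout, and the rounding offset 0.4054 are assumptions made for illustration. The NSR here measures noise added by requantization relative to the decoded input, not relative to the original audio.

```python
import math

# Assumed rounding offset; 0.4054 is the value commonly used by AAC encoders.
ROUNDING_OFFSET = 0.4054


def requantize(q, d):
    """Magnitude of a quantized coefficient after raising the scalefactor by d.

    Assumes x = q**(4/3) * 2**(sf/4); requantizing x with scalefactor sf + d
    then gives floor(q * 2**(-3*d/16) + offset), independent of sf.
    """
    return int(q * 2.0 ** (-3.0 * d / 16.0) + ROUNDING_OFFSET)


def build_nsr_table(max_q, max_d):
    """table[q][d] = noise energy / signal energy (linear) for one coefficient."""
    table = [[0.0] * (max_d + 1) for _ in range(max_q + 1)]
    for q in range(1, max_q + 1):
        for d in range(max_d + 1):
            # Reconstructed value divided by the input value; the gain cancels.
            ratio = (requantize(q, d) / q) ** (4.0 / 3.0) * 2.0 ** (d / 4.0)
            table[q][d] = (1.0 - ratio) ** 2
    return table


def band_nsr_db(coeffs, d, table):
    """NSR in dB of one scalefactor band for increment d, using only lookups."""
    # Signal energy of each coefficient is proportional to |q|**(8/3).
    signal = sum(abs(q) ** (8.0 / 3.0) for q in coeffs)
    noise = sum(table[abs(q)][d] * abs(q) ** (8.0 / 3.0) for q in coeffs)
    if signal == 0.0 or noise == 0.0:
        return float("-inf")  # empty band or lossless requantization
    return 10.0 * math.log10(noise / signal)
```

A transcoder along these lines could build the table once and then evaluate candidate increments per band without dequantizing any coefficient; how the increment is chosen against the target bitrate is left out because the abstract does not specify it.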
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009311609
http://hdl.handle.net/11536/78078
Appears in Collections: Thesis


Files in This Item:

  1. 160901.pdf
