Title: Scalable Video Coding - Advanced Fine Granularity Scalability (可調視訊編碼之高等細緻可調性研究)
Authors: Wen-Hsiao Peng (彭文孝)
Tihao Chiang (蔣迪豪)
Chen-Yi Lee (李鎮宜)
Institute of Electronics
Keywords: Scalable Video Coding (可調視訊編碼)
Issue Date: 2005
Abstract: To support video streaming and broadcasting in heterogeneous environments, MPEG-4 defines fine granularity scalability (FGS), which degrades visual quality gracefully when the bit-stream is truncated. Although it offers good scalability at fine granularity, the current approach suffers from poor coding efficiency and poor subjective quality. This dissertation provides an integrated solution, spanning prediction, bit-plane coding, and coding order, to improve MPEG-4 FGS. Specifically, we propose an enhanced mode-adaptive FGS (EMFGS) algorithm and a context-adaptive bit-plane coder (CABIC) to deliver higher coding efficiency. Further, building on the CABIC framework, we develop a stochastic bit reshuffling (SBR) technique to achieve better subjective quality.
For higher coding efficiency, our EMFGS constructs better enhancement-layer predictors from both the base layer and the enhancement layer. To minimize possible drifting errors, the EMFGS encoder employs a dummy prediction loop that simulates the drifting behavior in the decoder. Then, guided by a theoretical framework, the prediction is switched among different predictors so that coding efficiency is maximized while drifting errors are minimized. Compared with MPEG-4 FGS, our EMFGS offers a PSNR gain of 1 to 1.5 dB.
In addition to constructing better predictors, we improve coding efficiency with CABIC, which codes the enhancement layer in a context-adaptive, bit-by-bit manner. To fully exploit the existing correlations, the context models are designed from both the energy distribution within a transform block and the spatial correlations among adjacent blocks. Moreover, the context across bit-planes is exploited to save side information. Compared with the bit-plane coding in MPEG-4 FGS, our CABIC scheme achieves a further PSNR gain of 0.5 to 1.0 dB.
Operating bit by bit, the SBR improves subjective quality by refining the base layer in a content-aware manner. Specifically, the coefficient bits at the enhancement layer are reshuffled so that regions containing more energy are given higher priority for refinement. In particular, to avoid transmitting the exact coding order, a model-based approach derived from the maximum likelihood principle is used to decide the coding priority. Compared with approaches that use a deterministic coding order, our SBR provides better visual quality while maintaining similar or even higher coding efficiency.
In conclusion, this work shows that MPEG-4 FGS can be significantly improved in both coding efficiency and subjective quality, and that the performance gap between scalable and non-scalable codecs can be reduced. Moreover, the proposed schemes can be applied in the upcoming standard for scalable video coding.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009111828
http://hdl.handle.net/11536/44457
Appears in Collections: Thesis


Files in This Item:

  1. 182801.pdf
