Title: Design of An Intra Predictor with Spatial Scalability for Scalable Video Decoding (應用於SVC視訊編碼標準之空間可適性內幀解碼器設計)
Authors: Lai, Yu-Fan (賴昱帆)
Lee, Chen-Yi (李鎮宜)
Institute of Electronics
Keywords: Spatial Scalability; decoder; Intra; High Profile; scalable decoder; SVC; H.264
Issue Date: 2008
Abstract: The Scalable Video Coding (SVC) extension of H.264/AVC is the latest video coding standard. Relative to the scalable profiles of prior video coding standards, it achieves significantly better coding efficiency while supporting a greater degree of scalability. For spatial scalability, SVC follows the conventional multilayer approach: each spatial layer uses motion-compensated prediction and intra prediction, as in single-layer coding. To improve coding efficiency over simulcasting the spatial resolutions as independent layers, SVC adds prediction between layers, called inter-layer prediction. The base layer of SVC must conform to conventional H.264/AVC and decode correctly, so a complete SVC decoder must support both conventional H.264 bitstreams and SVC extension bitstreams.
We propose a High profile SVC intra prediction engine composed of two major modules: a basic prediction module and an Intra_BL prediction module. The basic prediction module decodes conventional H.264 intra-predicted blocks. High profile supports macroblock-adaptive frame-field (MBAFF) coding; to reduce the buffer size this requires, we reuse pixels through upper, left, and corner data reuse sets (DRS), which lowers buffer cost and improves access efficiency. For Intra_8x8 (luma 8x8) prediction, which High profile also supports, we simplify the Reference Sample Filtering Process (RSFP) with a base-mode predictor, reducing both processing latency and buffer cost.
The Intra_BL prediction module decodes Intra_BL, the new intra prediction type introduced by SVC. It consists of banked SRAM, a basic horizontal interpolator, a basic vertical interpolator, and an extended vertical interpolator. We also optimize the interpolator architecture to be more area-efficient than a direct implementation. Building on this preliminary design, we further propose a power-efficient Intra_BL prediction module with two improvements: a second stage of register sets added to the memory hierarchy, and an equality determination applied before the basic interpolation process. Together these two methods reduce total power consumption by 46.43%.
Finally, the power-efficient SVC intra prediction engine is implemented in a 90 nm technology with a total area of 42756 NAND2 CMOS gates at a working frequency of 145 MHz. The power consumption is 0.292 mW at 100 MHz for H.264 and 2.86 mW at 145 MHz for SVC. The design meets the real-time processing requirement for HD1080 video at 30 fps at 100 MHz in H.264 mode, and for up to two spatial layers (HD720 and HD1080) at 30 fps at 145 MHz in SVC mode.
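The real-time figures in the abstract can be put in perspective with a rough cycle budget per macroblock. The sketch below is a back-of-the-envelope estimate, not a calculation taken from the thesis: it assumes 16x16 macroblocks, frame heights rounded up to a whole number of macroblocks, and that every macroblock of every decoded layer passes through the engine. The function names are hypothetical.

```python
import math

MB_SIZE = 16  # assumption: 16x16 luma macroblocks, as in H.264/AVC


def macroblocks_per_frame(width, height):
    """Number of macroblocks covering a frame, rounding partial rows/columns up."""
    return math.ceil(width / MB_SIZE) * math.ceil(height / MB_SIZE)


def cycles_per_macroblock(clock_hz, fps, layers):
    """Average clock cycles available per macroblock across all decoded layers."""
    mbs_per_second = fps * sum(macroblocks_per_frame(w, h) for w, h in layers)
    return clock_hz / mbs_per_second


# Operating points quoted in the abstract.
h264_budget = cycles_per_macroblock(100e6, 30, [(1920, 1080)])
svc_budget = cycles_per_macroblock(145e6, 30, [(1280, 720), (1920, 1080)])

print(f"H.264, HD1080 @ 30 fps, 100 MHz: {h264_budget:.1f} cycles/MB")
print(f"SVC, HD720 + HD1080 @ 30 fps, 145 MHz: {svc_budget:.1f} cycles/MB")
```

Under these assumptions both operating points leave roughly the same budget, a little over 400 cycles per macroblock, which suggests the 145 MHz SVC clock was chosen to absorb the extra HD720 layer at about the same per-macroblock cost.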
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079611595
http://hdl.handle.net/11536/41722
Appears in Collections: Thesis


Files in This Item:

  1. 159501.pdf
