Title: | Fast intra prediction algorithm and design for H.264/MPEG-4 AVC scalable extension |
Authors: | 溫孟勳 Wen, Meng-Hsun; 張添烜 Chang, Tian-Sheuan; Institute of Electronics |
Keywords: | SVC;Scalable video coding;intra coding;intra prediction;image compression;H.264;MPEG-4 |
Issue Date: | 2011 |
Abstract: | The H.264/MPEG-4 AVC scalable extension (scalable video coding, SVC) provides high compression efficiency and, in a single encoding pass, combines multiple frame sizes, multiple visual quality levels, and multiple frame rates in one bitstream. Depending on the application, the available bandwidth, and the processing capability of the terminal, a decoder extracts and reconstructs only the part of the stream it needs. However, this scalability significantly increases design complexity, particularly for real-time high-definition video encoding.
To address this problem, this thesis proposes a fast two-step intra prediction algorithm and its hardware design. The first step applies the integer discrete cosine transform and the Hadamard transform to the picture to obtain its frequency coefficients, then uses selected AC coefficients to judge block smoothness and decide the intra block size. The second step reuses the transforms from the first step to estimate the texture direction, determines the candidate modes for 4x4 and 16x16 blocks, and derives the 8x8 candidate modes by merging the 4x4 candidates. Experimental results show that, compared with the JM full search method, the proposed method eliminates more than 80% of the candidate modes while maintaining similar quality (average BD-PSNR difference: -0.01 dB for CIF, +0.12 dB for 1080p; average BD-Rate difference: +0.01% for CIF, -3.13% for 1080p). In the hardware design, intra prediction for the different block sizes shares common computation units, and two macroblocks are processed in an interleaved fashion to avoid the idle cycles caused by data dependency, with quality layer processing embedded in the reconstruction loop. Implemented in a 90 nm CMOS process, the design costs 148k gates and a 143 kbit (17.9 kbyte) SRAM buffer. At an operating frequency of 135 MHz, the encoder supports three quality layers, three spatial layers (CIF, SD 480p, and HD 1080p), and up to 60 frames per second. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079811621 http://hdl.handle.net/11536/46786 |
Appears in Collections: | Thesis |
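The two-step idea summarized in the abstract (judge smoothness from AC coefficients, then reuse the same transform to estimate texture direction) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the transform matrix is the standard H.264 4x4 forward core transform, but the function names, the `threshold` value, the choice of which AC coefficients to sum, and the three-way direction rule are all assumptions made for illustration. The Hadamard stage and the 8x8/16x16 decisions are omitted.

```python
# Illustrative sketch only; thresholds and decision rules are hypothetical.

# Forward 4x4 integer core transform matrix used by H.264 (scaling omitted).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def integer_dct_4x4(block):
    """Compute Y = CF * X * CF^T for a 4x4 block of pixel values."""
    return matmul(matmul(CF, block), transpose(CF))


def ac_energy(coeffs):
    """Sum of absolute values of all coefficients except the DC term."""
    total = sum(abs(c) for row in coeffs for c in row)
    return total - abs(coeffs[0][0])


def is_smooth(block, threshold=64):
    """Treat a block as smooth when its AC energy is below a hypothetical threshold."""
    return ac_energy(integer_dct_4x4(block)) < threshold


def dominant_direction(coeffs):
    """Guess a prediction direction from the first row and first column of AC terms.

    Energy in the first row means the pixels vary along x, so columns are
    nearly constant and vertical prediction is a plausible candidate.
    Energy in the first column means the opposite, favoring horizontal.
    """
    row_energy = sum(abs(c) for c in coeffs[0][1:])
    col_energy = sum(abs(coeffs[i][0]) for i in range(1, 4))
    if row_energy > col_energy:
        return "vertical"
    if col_energy > row_energy:
        return "horizontal"
    return "dc"
```

A flat block such as `[[128] * 4 for _ in range(4)]` has zero AC energy and is classified as smooth, while a block whose rows all read `[0, 50, 100, 150]` is not smooth and maps to the vertical candidate under these assumed rules. A real encoder would keep several candidate modes per block rather than a single guess.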