Title: | Efficient Design of JPEG2000 EBCOT TIER-I Context Formation Encoder (JPEG2000 最佳化區塊編碼器 IP 設計) |
Authors: | 張其勤 Chi-Chin Chang; 陳紹基 Sau-Gee Chen; Electronics and Electro-Optics Program, College of Electrical Engineering (電機學院電子與光電學程) |
Keywords: | Block Coding; Wavelet Transform; JPEG2000; EBCOT; CF; DWT; Context Formation; Pass-Parallel |
Issue Date: | 2005 |
Abstract: | JPEG2000 is a new still-image compression standard. Its most attractive feature is that it significantly reduces the bit rate while preserving image quality. In exchange, it requires more computation and more hardware resources than other standards, and EBCOT accounts for most of the computation time. Many design techniques have been proposed to realize it efficiently, and the Pass-Parallel architecture is one of the most efficient.
In this thesis, we propose methods that improve computation speed, reduce hardware area, and raise hardware utilization for the context formation (CF) engine of Pass-Parallel EBCOT. The Sample-Parallel Pass-Type Detection (SPPD) method shortens the time needed to decide which coding pass (pass type) each of the four samples in a column belongs to. The Column-Based Pass-Parallel Coding (CBPC) method codes all four samples in the same column concurrently.
We designed a CF encoder to verify both methods. It processes the input samples in two steps and optimizes each step separately. Step one uses SPPD to shorten pass-type determination and thus improves overall computation performance. Step two uses CBPC to code all four samples in the same column according to the pass types determined in step one, which reduces hardware cost and improves hardware utilization.
The design was synthesized with Synopsys® Design Compiler using the TSMC CMOS 0.15 μm process. The pre-layout synthesized area is 18127.31 μm², and in simulation the clock frequency reaches up to 600 MHz in the WCCOM worst_case_tree environment. At this clock frequency, encoding a 2304 × 1728 grayscale image takes 0.0116 seconds. Compared with the original Pass-Parallel CF, the two proposed methods reduce encoding time by 13.83%, reduce hardware area by 18.28%, and improve hardware utilization by 34.78%. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009167528 http://hdl.handle.net/11536/63524 |
Appears in Collections: | Thesis |
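
The abstract does not describe how SPPD works internally. As background only, the sequential pass-type rule that the JPEG2000 EBCOT Tier-1 coder applies to one four-sample column could be sketched as below. This is the standard sample-by-sample rule, not the thesis's SPPD method. The function name `classify_column` and the `outside_sig` flag, which collapses every neighbor outside the column into a single boolean, are assumptions made for illustration.

```python
from typing import List

# Pass types: significance propagation, magnitude refinement, cleanup.
SPP, MRP, CUP = 1, 2, 3


def classify_column(sig_before: List[bool], bits: List[int],
                    outside_sig: List[bool]) -> List[int]:
    """Return the pass type of each of the four samples in one stripe column.

    sig_before[i]  -- sample i became significant in an earlier bit-plane
    bits[i]        -- magnitude bit of sample i in the current bit-plane
    outside_sig[i] -- hypothetical flag: some neighbor of sample i outside
                      this column is significant when sample i is visited
    """
    passes = []
    sig_now = list(sig_before)  # significance state as the scan proceeds
    for i in range(4):
        if sig_before[i]:
            # Already significant before this bit-plane: refine its magnitude.
            passes.append(MRP)
            continue
        above = i > 0 and sig_now[i - 1]     # may have just turned significant
        below = i < 3 and sig_before[i + 1]  # not yet visited in this pass
        if outside_sig[i] or above or below:
            passes.append(SPP)
            if bits[i] == 1:
                sig_now[i] = True            # visible to the samples below
        else:
            passes.append(CUP)
    return passes


if __name__ == "__main__":
    print(classify_column([False] * 4, [1, 1, 0, 1],
                          [True, False, False, False]))
```

The chain through `sig_now` shows why the four decisions are not independent when made one sample at a time: whether sample `i` falls into the significance propagation pass can depend on the bit just coded for sample `i - 1`. Deciding all four pass types at once, as the abstract says SPPD does, presumably has to resolve this dependency in some way.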