Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | 陳威年 | en_US |
| dc.contributor.author | Wei-Nien Chen | en_US |
| dc.contributor.author | 杭學鳴 | en_US |
| dc.contributor.author | Hsueh-Ming Hang | en_US |
| dc.date.accessioned | 2014-12-12T01:13:52Z | - |
| dc.date.available | 2014-12-12T01:13:52Z | - |
| dc.date.issued | 2007 | en_US |
| dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT009511607 | en_US |
| dc.identifier.uri | http://hdl.handle.net/11536/38134 | - |
| dc.description.abstract | 由於顯示處理器的快速發展,近年來漸漸發展出將顯示處理器應用於非圖形的運算,以輔助中央處理器,此技術通稱為GPGPU。美國NVIDIA公司在2007年提出一個全新的顯示處理器架構,其全名為「統一運算單元架構」,簡稱CUDA,為現今對運算能力要求極高的資料密集型應用程式提供了具彈性的大型平行運算平台。在本篇論文中,我們將H.264/AVC的編碼系統建立在此架構上。 我們針對H.264/AVC編碼器中最耗費運算能力的motion estimation以及intra prediction 模式選擇兩個部份作CUDA平台的實現。我們對於intra prediction模式選擇提出了block層級的平行化,並且提出使用原始影像作為預測參考的intra prediction演算法。此外,為了要能完全的利用CUDA的處理能力,我們對於執行緒的分配使用與記憶體的配置做了最佳化,並且以此基礎設計了一套五個步驟的 motion estimation流程。我們在NVIDIA GeForce 8800GTX GPU平台上驗證我們的演算法,對於個別的模組達到了約12倍的加速,而整體H.264/AVC編碼器也有大約5倍的加速。 | zh_TW |
| dc.description.abstract | Owing to the rapid growth in graphics processing unit (GPU) capability, using the GPU as a coprocessor to assist the central processing unit (CPU) with data-intensive computation has become essential. In 2007, NVIDIA announced a GPU architecture called Compute Unified Device Architecture (CUDA), which greatly improves the programming flexibility of general-purpose GPU computing. In this thesis, we propose a highly parallel intra mode selection scheme and a full-search motion estimation scheme with fractional-pixel refinement, both optimized for the CUDA architecture. To parallelize intra mode selection at the block level, the original pixel values, rather than the coded pixels, are used to decide the best intra-prediction mode. In addition, to fully utilize the computing power of CUDA, the thread usage and memory access patterns are carefully tuned. Following these parallel-processing optimization rules, we design a motion estimation algorithm consisting of five stages, which processes as much data as possible to make full use of the GPU. The proposed algorithms are evaluated on the NVIDIA GeForce 8800GTX GPU platform. Each of the two modules runs about 12 times faster, and the overall H.264/AVC encoder runs about 5 times faster, than its PC-only counterpart. | en_US |
| dc.language.iso | en_US | en_US |
| dc.subject | 顯示處理器 | zh_TW |
| dc.subject | 統一運算單元架構 | zh_TW |
| dc.subject | 先進視訊編碼 | zh_TW |
| dc.subject | 內框模式估測 | zh_TW |
| dc.subject | 動作估測 | zh_TW |
| dc.subject | GPU | en_US |
| dc.subject | CUDA | en_US |
| dc.subject | H.264/AVC | en_US |
| dc.subject | intra prediction | en_US |
| dc.subject | motion estimation | en_US |
| dc.title | H.264/AVC視訊編碼器於NVIDIA CUDA之平行演算與實現 | zh_TW |
| dc.title | H.264/AVC Encoder Parallelized Realization on NVIDIA CUDA | en_US |
| dc.type | Thesis | en_US |
| dc.contributor.department | 電子研究所 | zh_TW |
Appears in Collections: Thesis