Title: 應用於卷積類神經網路之具能源效益加速器及資料處理流程
Energy-Efficient Accelerator and Data Processing Flow for Convolutional Neural Network
Authors: 陳致強
黃威
Chen, Chih-Chiang
Hwang, Wei
Institute of Electronics
Keywords: Convolutional Neural Network; Accelerator; Data processing flow
Issue Date: 2017
Abstract: In recent years, machine learning and convolutional neural networks (CNNs) have become some of the most popular research topics. Because hardware technology was not yet mature, the field received little attention in the previous generation; rapid advances in hardware over the past few years have revived interest. CNNs require a large amount of computation and a very large volume of data access and movement, and the energy spent on data access may even exceed that of the computation itself. How to exploit data reuse effectively and reduce data access has therefore become a major research problem. In this thesis, we propose a Processing Element (PE) architecture that reuses data effectively without fetching it again from external memory. We also propose a data processing flow that lets data propagate among the PEs, so that data is reused more often. In addition, we propose 3D and 2.5D accelerator system architectures that differ from the conventional approach; the 3D architecture uses through-silicon vias (TSVs) to further reduce the energy consumed by data transfer. We compare conventional 2D, 2.5D, and 3D architectures in terms of energy consumption, speed, and other metrics, and summarize the results in a table. In the later chapters, we present an FPGA implementation design flow for reference and future research. Overall, we present a novel reconfigurable accelerator for deep learning networks that suits both computation-intensive and data-intensive applications. This reconfigurable computing hardware technique can mitigate the power and memory walls, and can be applied to computer vision, computer graphics, convolutional neural networks, and deep learning networks.
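To make the data-reuse argument concrete, the following is a minimal illustrative sketch, not the thesis's PE architecture or data processing flow. It counts how many input reads reach external memory for a stride-1 convolution, first with no reuse and then with every fetched input kept locally. The function names, the unbounded `local` store standing in for on-chip storage, and the 6x6 input with a 3x3 kernel are all assumptions chosen for illustration; kernel-weight reads, TSV transfers, and energy per access are not modeled.

```python
def conv2d_naive(image, kernel):
    """Valid, stride-1 2D convolution where every input operand is
    read from (hypothetical) external memory."""
    h, w, k = len(image), len(image[0]), len(kernel)
    reads = 0
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
                    reads += 1  # one external read per multiply-accumulate
            row.append(acc)
        out.append(row)
    return out, reads


def conv2d_with_reuse(image, kernel):
    """Same convolution, but each input value is fetched from external
    memory once and then served from local storage (assumed unbounded)."""
    h, w, k = len(image), len(image[0]), len(kernel)
    local = {}  # stands in for PE-local registers / on-chip buffers
    reads = 0
    out = []
    for r in range(h - k + 1):
        row = []
        for c in range(w - k + 1):
            acc = 0
            for i in range(k):
                for j in range(k):
                    pos = (r + i, c + j)
                    if pos not in local:
                        local[pos] = image[pos[0]][pos[1]]
                        reads += 1  # external read only on first use
                    acc += local[pos] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out, reads


# Illustrative sizes only: a 6x6 input and a 3x3 all-ones kernel.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

naive_out, naive_reads = conv2d_naive(image, kernel)
reuse_out, reuse_reads = conv2d_with_reuse(image, kernel)
print(naive_reads, reuse_reads)
```

Under these assumptions both versions produce the same output, while the reuse version issues one external read per input value instead of one per multiply-accumulate. A real accelerator has limited local storage, so the achievable saving depends on how the data flow schedules and passes values between PEs.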
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070450222
http://hdl.handle.net/11536/142448
Appears in Collections: Thesis