Full metadata record
DC Field | Value | Language
dc.contributor.author | 林哲懷 | zh_TW
dc.contributor.author | 賴伯承 | zh_TW
dc.contributor.author | Lin, Che-Huai | en_US
dc.contributor.author | Lai, Bo-Cheng | en_US
dc.date.accessioned | 2018-01-24T07:38:04Z | -
dc.date.available | 2018-01-24T07:38:04Z | -
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070350249 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/139492 | -
dc.description.abstract | 卷積類神經網路(CNNs)應用於複雜的機器學習作業上具有高準確度以及容忍輸入雜訊的能力,因此近年來十分受到矚目。卷積運算所需的龐大計算效能對軟體及硬體方面都造成了極大的挑戰,不論在展開平行度還是資料復用方面都必須經過精心設計才能獲得優越的效能。目前的設計並未針對各個設計技巧與決策做完整分析,也缺乏對平行化的維度等設計準則背後的原因的深入研究。本論文對卷積的程式設計技巧與其在GPU架構下的影響做了質化與量化的分析,解釋了各個設計技巧對效能的影響。根據我們的分析,卷積運算在GPGPU有兩大效能瓶頸,分別是計算與從記憶體讀資料。我們在論文中針對這兩個效能瓶頸提出了軟體與硬體的解決方法,在cycle accurate的GPGPU模擬器GPGPU-sim [7]上的實驗結果顯示效能與原先的參考設計相比增進了4.4倍。 | zh_TW
dc.description.abstract | Convolutional Neural Networks (CNNs) have gained attention in recent years for their ability to perform complex machine learning tasks with high accuracy and resilience to noise in the inputs. The time-consuming convolution operations required by CNNs pose great challenges to both software and hardware designers. To achieve superior performance, a design must carefully balance exposing the massive computational parallelism against exploiting data reuse across complex data accesses. Existing designs lack a comprehensive analysis of the individual design techniques and decisions, and the analytical discussion and quantitative evidence behind design criteria, such as choosing the proper dimensions to parallelize, are not well studied. This thesis performs a series of qualitative and quantitative studies on both the programming techniques and their implications for the GPU architecture. The observations yield a comprehensive understanding of the correlation between the design techniques and the resulting performance. Based on these analyses, we pinpoint the two major performance bottlenecks of CNNs on GPGPUs: performing computation and loading data from global memory. Software and hardware enhancements are proposed in this thesis to alleviate these issues. Experimental results on a cycle-accurate GPGPU simulator, GPGPU-sim, demonstrate up to a 4.4x performance improvement over the reference design. | en_US
dc.language.iso | en_US | en_US
dc.subject | 卷積類神經網路 | zh_TW
dc.subject | 卷積 | zh_TW
dc.subject | 通用圖形處理器 | zh_TW
dc.subject | CNN | en_US
dc.subject | Convolution | en_US
dc.subject | GPGPU | en_US
dc.title | 個案研究:現代通用圖形處理器之卷積類神經網路之軟體與硬體改進 | zh_TW
dc.title | A Case Study: Software and Hardware Enhancement of Convolutional Neural Networks on Modern GPGPUs | en_US
dc.type | Thesis | en_US
dc.contributor.department | 電子研究所 (Institute of Electronics) | zh_TW
Appears in Collections: Thesis