Design and Implementation of Convolutional Neural Network Processor

Title:	Design and Implementation of Convolutional Neural Network Processor (卷積式類神經網路處理器之設計與實現)
Authors:	Yu, Ren-Xuan (俞人暄); Lee, Chen-Yi (李鎮宜); Institute of Electronics (電子研究所)
Keywords:	Convolutional Neural Network; Hardware Acceleration; Machine Learning
Publication date:	2016
Abstract:	The convolutional neural network (CNN) is a machine learning model derived from the traditional neural network. Because of its high accuracy and smaller number of parameters, it has been widely adopted in smart systems and Internet of Things (IoT) scenarios in recent years. However, even a very simple CNN architecture involves a very large amount of computation, and some of its internal operations cause hardware utilization to fall as the network gets deeper. In addition, the architecture must be adjusted to the needs of each application. In this thesis, we therefore design and implement a flexible CNN processor that supports different CNN architectures, and we propose a method of reusing computation units that improves both hardware utilization and processing speed. The system is integrated on a Xilinx Virtex-7 FPGA and achieves a throughput of 4.799 × 10^9 synapses/s and an energy efficiency of 3.96 nJ/synapse.
URI:	http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070350296 http://hdl.handle.net/11536/139953
Appears in collections:	Thesis
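
To give the two figures quoted in the abstract some scale, the following is a minimal sketch that counts the synapses in one convolutional layer and estimates how long the layer would take and how much energy it would use at the reported rates. It rests on several assumptions that the record does not confirm: that one "synapse" corresponds to one multiply-accumulate connection, that both reported figures describe the same operating point, and that the layer dimensions (a LeNet-style 5x5 layer) are representative. The function names and the example layer are hypothetical and do not come from the thesis.

```python
# Figures quoted in the abstract.
THROUGHPUT_SYNAPSES_PER_S = 4.799e9   # synapses per second
ENERGY_PER_SYNAPSE_J = 3.96e-9        # joules per synapse (3.96 nJ)


def conv_layer_synapses(out_h, out_w, out_ch, in_ch, k_h, k_w):
    """Count connections in one convolutional layer.

    Assumption: one synapse = one multiply-accumulate, i.e. one weight
    applied at one output position. Bias terms are ignored.
    """
    return out_h * out_w * out_ch * in_ch * k_h * k_w


def estimate_cost(synapses):
    """Return (latency in seconds, energy in joules) at the reported rates.

    Assumption: the processor sustains its reported throughput for the
    whole layer, which real workloads may not achieve.
    """
    latency_s = synapses / THROUGHPUT_SYNAPSES_PER_S
    energy_j = synapses * ENERGY_PER_SYNAPSE_J
    return latency_s, energy_j


# Average power implied by the two figures, if they share an operating point.
implied_power_w = THROUGHPUT_SYNAPSES_PER_S * ENERGY_PER_SYNAPSE_J

if __name__ == "__main__":
    # Hypothetical layer: 28x28 output, 6 output channels, 1 input channel, 5x5 kernel.
    n = conv_layer_synapses(28, 28, 6, 1, 5, 5)
    latency_s, energy_j = estimate_cost(n)
    print(f"synapses: {n}")
    print(f"latency:  {latency_s * 1e6:.1f} us")
    print(f"energy:   {energy_j * 1e3:.3f} mJ")
    print(f"implied average power: {implied_power_w:.1f} W")
```

Multiplying the throughput by the energy per synapse gives roughly 19 W. This is only an arithmetic consequence of the two published numbers and is not a power measurement reported in the record.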