Title: Design and Implementation of Convolutional Neural Network Processor
Authors: Yu, Ren-Xuan (俞人暄)
Lee, Chen-Yi (李镇宜)
Institute of Electronics
Keywords: Convolutional Neural Network; Hardware Acceleration; Machine Learning
Issue Date: 2016
Abstract: The convolutional neural network (CNN) is a machine learning model developed from the traditional neural network. Owing to its higher accuracy and fewer parameters, it has in recent years been widely used in smart systems and IoT scenarios. However, even a very simple CNN architecture involves a large amount of computation, and some of its internal operations cause hardware utilization to fall as the architecture grows deeper. Moreover, the architecture must be adjusted to meet the needs of each application. In this thesis, we therefore design and implement a flexible CNN processor that supports different CNN architectures, and the proposed method of reusing computing units effectively improves hardware utilization and processing speed. The system is integrated on a Xilinx Virtex-7 series FPGA and achieves a throughput of 4.799e+9 synapses/s and an energy efficiency of 3.96 nJ/synapse.
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070350296
http://hdl.handle.net/11536/139953
Appears in Collections: Thesis