Title: | 使用隱含知識技術壓縮多行深度學習網路 Multi-Column Deep Learning Network Compression Using Dark Knowledge Technique |
Authors: | 火神龍 Sousa Leite de Carvalho, Marcus Vinicius; 林進燈 Lin, Chin-Teng; Department of Electrical Engineering (電機工程學系) |
Keywords: | Deep Learning Network; Machine Learning; Dark Knowledge; Deep Learning |
Issue Date: | 2015 |
Abstract: | Dark Knowledge techniques apply to Deep Neural Networks in two contexts: Model Compression and Specialist Networks.
In the first, Model Compression, a simpler model with fewer parameters is trained to match the performance of a larger model. The simpler model can run on low-memory systems such as satellites, smartphones, or embedded systems, and still reach the performance of the complex network in a faster and cheaper way.
In the second, Specialist Networks, dark knowledge techniques are used to improve the performance of a deep network model regardless of its underlying complexity. Each specialist network is trained on a different set of classes rather than as a single network covering all of them, and the ensemble built by combining the specialists benefits the overall network.
The present work focuses on Deep Learning techniques to enhance the performance of a Multi-Column Deep Learning Network, on the extraction of its dark knowledge, and on the application of that knowledge to a compressed model in order to improve the latter's training and test performance. We present techniques to improve the training and accuracy of a Deep Learning Network, apply a committee machine to extract more performance from it, and finally transfer its knowledge to small networks. The results suggest that dark knowledge still has a substantial impact on the study of Deep Multi-Layer Perceptrons. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070150735 http://hdl.handle.net/11536/125576 |
Appears in Collections: | Thesis |
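
The knowledge transfer summarized in the abstract can be illustrated with a minimal sketch. It assumes the commonly used soft-target formulation of dark knowledge (a temperature-scaled softmax, with the small network trained against the averaged committee output); the record itself does not state the thesis's exact loss, so the function names, the averaging rule, the temperature of 4.0, and all logit values below are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def committee_soft_targets(column_logits, temperature):
    """Average the softened predictions of the column (teacher) networks.

    Assumption: the committee combines its columns by simple averaging.
    """
    probs = [softmax(logits, temperature) for logits in column_logits]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]

def distillation_loss(student_logits, soft_targets, temperature):
    """Cross-entropy between the soft targets and the student's softened output."""
    q = softmax(student_logits, temperature)
    return -sum(p * math.log(qk) for p, qk in zip(soft_targets, q))

# Hypothetical logits from three columns for a single 3-class example.
columns = [[5.0, 2.0, 0.5], [4.0, 2.5, 1.0], [6.0, 1.0, 0.0]]
targets = committee_soft_targets(columns, temperature=4.0)

# Hypothetical logits from the small (compressed) network on the same example.
loss = distillation_loss([3.0, 1.0, 0.2], targets, temperature=4.0)
```

Raising the temperature spreads probability mass onto the non-maximal classes, which is where the "dark knowledge" about class similarity lives; minimizing `loss` over many examples pushes the small network toward the committee's softened predictions.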