Title: 以視覺化分析探討不均勻數據對卷積神經網路的影響
The influence of imbalanced data on convolutional neural network – a visualisation approach
Authors: Hsu, Tsung-Jen (許宗仁)
Sun, Chuen-Tsai (孫春在)
Institute of Computer Science and Engineering
Keywords: Convolutional Neural Network; Imbalanced Data; Neural Network Visualisation
Publication date: 2017
Abstract: In recent years, the rise of deep learning and its significant breakthroughs in fields such as computer vision and speech recognition have brought renewed attention to neural networks, which had fallen out of favor for a time. The Convolutional Neural Network is one of the state-of-the-art deep learning models and has been widely applied in the field of computer vision in recent years. Meanwhile, more and more deep learning algorithms are moving from academia into industry, and this transition raises many difficulties. Imbalanced data may not be the hardest of these problems, but it is certainly one of the most important. Neural networks have long been described as a "black box"; in this research we open that black box with visualisation methods to explore in depth the effects of imbalanced data on Convolutional Neural Networks. In a self-designed experiment, the dataset is divided into four distributions, each distribution is fed into the neural network for training, and a visualisation tool is used to present the features learned from each distribution, in order to examine the adverse effects of imbalanced data inside the network. The results show that differently distributed imbalanced data harm Convolutional Neural Networks in different ways. The research also finds that the performance degradation caused by imbalanced data arises because the higher convolutional layers fail to learn features sufficient to distinguish the poorly performing classes, whereas the degradation is less strongly related to the lower convolutional layers.
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070456155
http://hdl.handle.net/11536/142838
Appears in Collections: Thesis