Title: 基於訊號處理技術以及動態時間校正演算法之殭屍網路偵測系統
A Botnet Detection System Based on Signal Processing Technique and Dynamic Time Warping Algorithm
Authors: 胡喬峰
曾文貴
Hu, Chiao-Feng
Tzeng, Wen-Guey
Institute of Network Engineering
Keywords: botnet detection; signal processing; discrete Fourier transform; dynamic time warping; K-means clustering
Issue Date: 2017
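Abstract: With the rapid development of modern networks, network security has become a critical issue, and the threat that botnets pose to the Internet can no longer be underestimated. Many studies over the past decade have therefore addressed botnet detection. Most of these methods, however, rely on features such as the packet sizes within a network flow or the duration of the flow to decide whether a flow is botnet C&C communication. An attacker can easily evade such detection by changing the ports or protocols used, or by altering the sizes of the packets sent.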
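This study therefore proposes a system that aggregates network packets into conversations. Because a conversation is defined by a 2-tuple, it is unaffected by the ports or protocols used. In addition to 11 features that have proven useful in earlier botnet detection work, the system uses the Discrete Fourier Transform to compute 6 new regularity-related features, viewing a network conversation in the frequency domain. Finally, it applies Dynamic Time Warping, treating the packet characteristics of botnet conversations as time series, to compute an additional 3K DTW features. The 6+3K new features proposed here improve botnet detection accuracy over the 11 commonly used features alone.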
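The aggregation and frequency-domain steps can be sketched as follows. This is a minimal illustration, not the implementation from the thesis: the packet tuple layout, the function names `group_conversations`, `dft_magnitudes`, and `peak_energy_ratio`, the choice to treat the IP pair as unordered, and the peak-energy feature itself are all assumptions. The abstract does not say which 6 DFT features are used.

```python
import cmath
from collections import defaultdict


def group_conversations(packets):
    """Group packets into conversations keyed by an IP 2-tuple.

    `packets` is an iterable of (timestamp, src_ip, dst_ip, size) tuples
    (a hypothetical layout). Ports and protocol are deliberately ignored.
    Assumption: both directions belong to the same conversation, so the
    key is the sorted IP pair.
    """
    conversations = defaultdict(list)
    for ts, src, dst, size in packets:
        key = tuple(sorted((src, dst)))
        conversations[key].append((ts, size))
    for pkts in conversations.values():
        pkts.sort()  # order each conversation by timestamp
    return dict(conversations)


def dft_magnitudes(series):
    """Magnitudes of the naive O(n^2) discrete Fourier transform."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(series)))
        for k in range(n)
    ]


def peak_energy_ratio(series):
    """Hypothetical regularity feature: the share of non-DC spectral
    energy that falls in the single strongest frequency bin."""
    n = len(series)
    if n < 2:
        return 0.0
    mean = sum(series) / n
    mags = dft_magnitudes([x - mean for x in series])  # remove DC offset
    half = [m * m for m in mags[1:n // 2 + 1]]  # bins up to Nyquist
    energy = sum(half)
    if energy < 1e-12:  # constant series: no periodic content
        return 0.0
    return max(half) / energy
```

A strictly periodic series, such as packet sizes that alternate in a fixed pattern, concentrates its energy in one bin, so the ratio approaches 1. Irregular traffic spreads energy across bins and yields a lower value.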
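In the experiments, we split the dataset into a training set and a test set and systematically add the 6 DFT-features and the 3K DTW-features to each of 5 different feature subsets. With the 6 DFT-features added, the accuracy of all 5 feature sets exceeds 96.61%. With the 3K DTW-features added as well, the accuracy of all 5 feature sets exceeds 98%. F5 shows the largest gain, rising from 94.19% to 98.49%.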
With the rapid development of network technology, network security has become a very important issue. Botnets have posed a great threat to cybersecurity in recent years, and many botnet detection studies have been carried out over the past decade. However, many of these studies rely on features such as the packet sizes in a flow or the duration of a flow to decide whether the flow is botnet C&C communication. An attacker can easily evade these flow-based detection methods by changing the ports, the protocols, or even the packet sizes.
Hence, in this thesis, we propose a conversation-based botnet detection system that uses signal processing techniques and the dynamic time warping algorithm. In the system, packets are aggregated into conversations according to their source and destination IP addresses, so the port number and protocol do not affect the aggregation. In addition, we calculate 6 new features based on the Discrete Fourier Transform to view a conversation in the frequency domain. Finally, another 3K new features are calculated with the dynamic time warping algorithm. Adding these 6+3K features improves detection accuracy over using only the features commonly used in past work.
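The dynamic time warping step rests on a distance between two time series of possibly different lengths. The sketch below shows the textbook dynamic-programming form of that distance with an absolute-difference local cost. The helper `dtw_features` is hypothetical: the abstract does not explain how the 3K features are built, so measuring a series against K reference series is only an assumption made for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric
    sequences, using |x - y| as the local cost and no window constraint.
    Keeps only two rows of the cost table, so memory is O(len(b))."""
    inf = float("inf")
    n, m = len(a), len(b)
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # aligning two empty prefixes costs nothing
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def dtw_features(series, references):
    """Hypothetical feature vector: DTW distance from `series` to each
    of K reference series (for example, cluster representatives)."""
    return [dtw_distance(series, ref) for ref in references]
```

Because warping lets one sample align with several, a series and a time-stretched copy of it have distance 0, which is why the measure suits packet sequences whose timing drifts between conversations.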
URI: http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070456530
http://hdl.handle.net/11536/141370
Appears in Collections: Thesis