Extracting, Classifying and Anonymizing Packet Traces with Case Studies on False Positive/Negative Assessment

Title:	萃取、分類及匿名封包流量與誤檔漏檔之個案研究 Extracting, Classifying and Anonymizing Packet Traces with Case Studies on False Positive/Negative Assessment
Author:	王聲浩 Wang, Sheng-Hao; 林盈達 Lin, Ying-Dar; Institute of Network Engineering
Keywords:	trace repository; traffic classification; packet anonymization; false positive; false negative
Issue Date:	2009
Abstract:	Real-world network traffic is a valuable asset for network research and development. A well-classified collection of packet traces lets researchers quickly select the traffic classes they need, which reduces the cost and time of traffic collection. However, publishing packet traces risks exposing users' private information, which attackers could exploit; most organizations that currently publish traces rely on contributors to upload them and do not guarantee the protection of private data inside the traces. This work therefore presents PCAP Lib, a packet trace repository that extracts, classifies, and anonymizes traces, with three goals. First, it uses the logs of multiple traffic-detection devices to actively extract traces from real-world traffic, classify them into 10 application types, and further label them as healthy or malicious. Second, because existing anonymization techniques mostly target TCP/IP header fields, and parsing the many application-layer protocols remains difficult even for techniques that handle payloads, it applies deep packet anonymization (DPA) to protect personal privacy. Third, because no detection device achieves 100% accuracy in packet trace testing, it integrates an analysis procedure that investigates the causes of false positives (FPs) and false negatives (FNs) in devices and identifies the most frequent cases. Over five months, we actively collected 323 distinct packet traces, of which 33% are healthy and 67% are malicious. To evaluate anonymization methods, we define two metrics, "privacy/utility" and "efficiency"; DPA achieves an efficiency of 93%, compared with 27% and 33% for the other methods. In the FP/FN case studies, 63% of FP causes are due to traffic similarity, chiefly P2P traffic whose dynamic ports lead detection devices to mistake it for other application protocols, and 62% of FN causes are due to signature insufficiency, where the device's signature database contains no signature that matches the traffic.
URI:	http://140.113.39.130/cdrfb3/record/nctu/#GT079756527 http://hdl.handle.net/11536/46018
Appears in Collections:	Thesis

Files in This Item:

652701.pdf
