Title: An efficient pattern matching scheme in LZW compressed sequences
Authors: Lee, Tsern-Huei
Huang, Nai-Lun
Institute of Communications Engineering
Keywords: bit-parallelism;compressed pattern matching;information search and retrieval;LZW compression;malwares detection;string matching
Issue Date: 1-Jul-2008
Abstract: Compressed pattern matching (CPM) is an emerging research field that addresses the following problem: given a compressed sequence and a pattern, process the sequence with minimal (or no) decompression to find the occurrence(s) of the pattern in the uncompressed sequence. CPM can be applied to detect malware and leakage of confidential information directly in compressed files. In this paper, we report our work on CPM in Lempel-Ziv-Welch (LZW) compressed sequences. We propose an efficient bitmap-based realization of the Amir-Benson-Farach algorithm. We also generalize the algorithm to find all pattern occurrences and report their absolute positions in the uncompressed sequence. Experiments are conducted to measure the space requirements of our proposed generalization and of two related CPM schemes that can also be realized with bitmaps. Results show that our proposed generalization requires the least storage for moderate and long patterns. We also conduct experiments to compare the throughput of our proposed generalization with that of these two related CPM schemes and of the decompress-then-search scheme. Results show that our proposed generalization significantly outperforms the decompress-then-search scheme. When scanning a file that contains pattern occurrences, our proposed generalization performs slightly better than the two related CPM schemes. The difference is significant when scanning a file with no pattern occurrence. Copyright (c) 2008 John Wiley & Sons, Ltd.
URI: http://dx.doi.org/10.1002/sec.32
http://hdl.handle.net/11536/8693
ISSN: 1939-0114
DOI: 10.1002/sec.32
Journal: SECURITY AND COMMUNICATION NETWORKS
Volume: 1
Issue: 4
Start page: 325
End page: 335
Appears in Collections: Journal Articles
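
The abstract compares the proposed scheme against a decompress-then-search baseline. The following is a minimal Python sketch of that baseline only, not of the paper's bitmap-based algorithm. It assumes a textbook LZW variant with a byte alphabet, an unbounded dictionary, and codes held as plain integers; the function names (`lzw_compress`, `lzw_decompress`, `decompress_then_search`) are hypothetical and chosen for illustration. It reports the absolute position of every occurrence in the uncompressed sequence, which is the output the abstract asks of a CPM scheme.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Textbook LZW (assumed variant): emit one code per longest known prefix."""
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # next free code
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the dictionary while decoding, mirroring lzw_compress."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)


def decompress_then_search(codes: list[int], pattern: bytes) -> list[int]:
    """Baseline: fully decompress, then list every (overlapping) occurrence."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    text = lzw_decompress(codes)
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


if __name__ == "__main__":
    codes = lzw_compress(b"abababcabab")
    print(decompress_then_search(codes, b"abab"))
```

This baseline materializes the whole uncompressed sequence before searching. A CPM scheme of the kind the abstract describes instead works on the code stream with minimal or no decompression.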


Files in This Item:

  1. 000207480100006.pdf
