Title: | A web data extraction description language and its implementation |
Authors: | Wu, IC; Su, JY; Chen, TB; Department of Computer Science |
Publication date: | 2005 |
Abstract: | A data extraction model, named the browser-oriented data extraction (BODE) model, was proposed in [14] to extract web contents with script functions. In this model, the system, built on top of browsers, accesses pages by simulating users' operations on browsers. Based on this model, this paper defines a scripting language, named the BODED (Browser-Oriented Data Extraction Description) language, which instructs the system how to perform data extraction. This paper also proposes a technique, called indirect browser replication, to implement a BODE system, and optimizes the performance of this technique. |
URI: | http://hdl.handle.net/11536/17954 |
ISBN: | 0-7695-2413-3 |
ISSN: | 0730-3157 |
Journal: | Proceedings of the 29th Annual International Computer Software and Applications Conference |
Start page: | 293 |
End page: | 298 |
Appears in collections: | Conference papers |