Title: A web data extraction description language and its implementation
Authors: Wu, IC; Su, JY; Chen, TB
Department of Computer Science
Issue Date: 2005
Abstract: A data extraction model, named the browser-oriented data extraction (BODE) model, was proposed in [14] to extract web contents with script functions. In this model, a system built on top of browsers accesses pages by simulating users' operations on those browsers. Based on this model, this paper defines a scripting language, named the BODED (Browser-Oriented Data Extraction Description) language, which instructs the system how to perform data extraction. This paper also proposes a technique, called indirect browser replication, for implementing a BODE system, and optimizes the performance of this technique.
URI: http://hdl.handle.net/11536/17954
ISBN: 0-7695-2413-3
ISSN: 0730-3157
Journal: Proceedings of the 29th Annual International Computer Software and Applications Conference
Begin Page: 293
End Page: 298
Appears in Collections: Conferences Paper