Full metadata record
DC Field | Value | Language
dc.contributor.author | Su, Jui-Yuan | en_US
dc.contributor.author | Sun, Der-Johng | en_US
dc.contributor.author | Wu, I-Chen | en_US
dc.contributor.author | Chen, Lung-Pin | en_US
dc.date.accessioned | 2014-12-08T15:07:09Z | -
dc.date.available | 2014-12-08T15:07:09Z | -
dc.date.issued | 2010-04-01 | en_US
dc.identifier.issn | 1023-2796 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/5616 | -
dc.description.abstract | Web data extraction systems must now not only extract data from web pages but also navigate correctly to the target pages. Most traditional web data extraction systems extract URLs directly from web pages and then access the next pages using the extracted URLs. This paper calls this the URL-oriented data extraction approach. However, more and more web pages use script functions, such as JavaScript, to access the next pages and may hide URLs inside these functions, making the URLs difficult to extract. To solve this problem, a new approach, named the browser-oriented data extraction (BODE) approach, is proposed. It is built on top of browser objects and accesses pages by simulating users' operations on browsers to invoke script functions. A data extraction system based on this approach is called a BODE system. Based on the BODE approach, this paper designs a BODE system with the following contributions: (a) Define a scripting language, named the BODED (Browser-Oriented Data Extraction Description) language, which instructs the BODE system to extract data. (b) Design a plug-in mechanism that can be used to extend the functionality of the BODE system. (c) Design a visualization tool to support data extraction in the BODE system. (d) Illustrate the plug-in mechanism of the BODE system by automating the playing of the game Connect6 on an Internet game site. | en_US
dc.language.iso | en_US | en_US
dc.subject | data extraction model | en_US
dc.subject | browser-oriented data extraction | en_US
dc.subject | URL-oriented data extraction | en_US
dc.subject | plug-ins | en_US
dc.subject | visualization tools | en_US
dc.title | ON DESIGN OF BROWSER-ORIENTED DATA EXTRACTION SYSTEM AND THE PLUG-INS | en_US
dc.type | Article | en_US
dc.identifier.journal | JOURNAL OF MARINE SCIENCE AND TECHNOLOGY-TAIWAN | en_US
dc.citation.volume | 18 | en_US
dc.citation.issue | 2 | en_US
dc.citation.spage | 189 | en_US
dc.citation.epage | 200 | en_US
dc.contributor.department | 資訊工程學系 | zh_TW
dc.contributor.department | Department of Computer Science | en_US
dc.identifier.wosnumber | WOS:000277269800005 | -
dc.citation.woscount | 1 | -
Appears in Collections: Articles