ON DESIGN OF BROWSER-ORIENTED DATA EXTRACTION SYSTEM AND THE PLUG-INS

Full metadata record

DC Field	Value	Language
dc.contributor.author	Su, Jui-Yuan	en_US
dc.contributor.author	Sun, Der-Johng	en_US
dc.contributor.author	Wu, I-Chen	en_US
dc.contributor.author	Chen, Lung-Pin	en_US
dc.date.accessioned	2014-12-08T15:07:09Z	-
dc.date.available	2014-12-08T15:07:09Z	-
dc.date.issued	2010-04-01	en_US
dc.identifier.issn	1023-2796	en_US
dc.identifier.uri	http://hdl.handle.net/11536/5616	-
dc.description.abstract	Web data extraction systems must not only extract data from web pages but also navigate to the target pages correctly. Most traditional web data extraction systems extract URLs directly from web pages and then access the next pages using the extracted URLs. This paper calls this the URL-oriented data extraction approach. However, more and more web pages use script functions, such as JavaScript, to access the next pages and may hide URLs inside these functions, making the URLs difficult to extract. To solve this problem, a new approach, named the browser-oriented data extraction (BODE) approach, is proposed. It is built on top of browser objects and accesses pages by simulating users' operations on browsers to invoke script functions. A data extraction system based on this approach is called a BODE system. Based on the BODE approach, this paper designs a BODE system with the following contributions: (a) Define a scripting language, named the BODED (Browser-Oriented Data Extraction Description) language, which instructs the BODE system to extract data. (b) Design a plug-in mechanism that can be used to extend the functionality of the BODE system. (c) Design a visualization tool to support data extraction in the BODE system. (d) Illustrate the plug-in mechanism of the BODE system by automating the playing of the game Connect6 on an Internet game site.	en_US
dc.language.iso	en_US	en_US
dc.subject	data extraction model	en_US
dc.subject	browser-oriented data extraction	en_US
dc.subject	URL-oriented data extraction	en_US
dc.subject	plug-ins	en_US
dc.subject	visualization tools	en_US
dc.title	ON DESIGN OF BROWSER-ORIENTED DATA EXTRACTION SYSTEM AND THE PLUG-INS	en_US
dc.type	Article	en_US
dc.identifier.journal	JOURNAL OF MARINE SCIENCE AND TECHNOLOGY-TAIWAN	en_US
dc.citation.volume	18	en_US
dc.citation.issue	2	en_US
dc.citation.spage	189	en_US
dc.citation.epage	200	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000277269800005	-
dc.citation.woscount	1	-
Appears in Collections:	Journal Articles
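
Illustrative note: the limitation of URL-oriented extraction described in the abstract can be shown with a minimal sketch. It is not taken from the paper and does not use the BODED language. The sample page, the `gotoPage` script function, and the names `LinkCollector` and `collect_links` are all hypothetical, chosen only for illustration. The sketch collects `href` values the way a URL-oriented extractor might, and shows that a script-driven link yields no URL to follow.

```python
from html.parser import HTMLParser

# Hypothetical page: one plain link and one link whose target
# is hidden inside a script call (gotoPage is an invented name).
SAMPLE_PAGE = """
<a href="/list?page=2">Next (plain link)</a>
<a href="javascript:void(0)" onclick="gotoPage(3)">Next (script link)</a>
"""


class LinkCollector(HTMLParser):
    """Separates anchors with a usable URL from script-driven anchors."""

    def __init__(self):
        super().__init__()
        self.urls = []          # URLs a URL-oriented extractor can follow
        self.script_links = []  # script calls that hide the real target

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        if href.lower().startswith("javascript:") or "onclick" in attr:
            # No navigable URL here: the target is computed by the script.
            self.script_links.append(attr.get("onclick") or href)
        elif href:
            self.urls.append(href)


def collect_links(html):
    """Return (navigable URLs, script-driven link handlers) found in html."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.urls, parser.script_links


urls, script_links = collect_links(SAMPLE_PAGE)
print("navigable URLs:", urls)
print("script-driven links:", script_links)
```

In this sketch only the first anchor produces a URL that can be fetched directly. The second exposes just a script call, which is the case where, per the abstract, a browser-oriented system would instead simulate the user's click inside a browser so that the script function runs.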