Title: A GUI-Based Environment for Web Data Extraction (一个具使用者图形介面之网页资料萃取系统)
Authors: John Huang (黄敬尧)
I-Chen Wu (吴毅成)
Degree Program of Computer Science, College of Computer Science
Keywords: Web data extraction
Issue Date: 2000
Abstract: The rapid growth of the Internet and electronic commerce has brought many benefits to businesses and individuals, and people spend more and more time browsing the Web. At times, however, they feel lost among the large volume and variety of information available. A mechanism is therefore needed to help users collect information quickly and systematically.
Web query languages such as XML-QL, WIDL, and GIDL are tools that let users extract data from Web pages automatically, but it is difficult to write a query quickly for Web pages whose design is overly complex. This thesis therefore proposes and designs an easy-to-learn Web data extraction system based on the GIDL Web query language. By combining data extraction with a graphical user interface, the tool speeds up the extraction and collection of Web data, improves the efficiency of Web data extraction, and reduces labor costs.
URI: http://140.113.39.130/cdrfb3/record/nctu/#NT891706008
http://hdl.handle.net/11536/68031
Appears in Collections: Thesis