Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | 廖赞玮 | en_US |
dc.contributor.author | Liao, Zan-Wei | en_US |
dc.contributor.author | 李嘉晃 | en_US |
dc.contributor.author | Lee, Chia-Hoang | en_US |
dc.date.accessioned | 2014-12-12T01:34:36Z | - |
dc.date.available | 2014-12-12T01:34:36Z | - |
dc.date.issued | 2008 | en_US |
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079657520 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/43528 | - |
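dc.description.abstract | Advances in technology and the growth of the Internet have caused the volume of information to rise rapidly, but this progress also means users must spend more time browsing for the documents they need. Given how widely search engines are used today, people want to obtain information more efficiently and effectively, and automatic summarization, together with the classification applications derived from it, plays an important role. If searching is combined with automatic summarization, users can judge from the summary whether to read an article. This not only reduces the time users spend browsing documents but also speeds up their searches. This study analyzes news content from the Yahoo News site with the Academia Sinica part-of-speech tag set to extract core keywords and convert sentences into keyword strings. Using characteristics of Chinese grammar and a synonym thesaurus, it expands the core keywords. The expanded keyword set is then used as the basis for selecting a keyword summary, and the concept proposed by [Yihong Gong, Xin Liu, 2001] is used to select a latent semantic analysis summary. The study integrates these two summaries, taking readability into account, to produce a single summary for users to read. | zh_TW |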
dc.description.abstract | With the popularity of the Internet, information overload has become a major problem, and people must spend more and more time looking for the information they need. Search engines are now used in many ways and for many purposes, so a system that can reduce the amount of content without losing its principal meaning is necessary. In this research, the application domain is Internet news summarization, and the corpus was collected from Yahoo. We use CKIP (Chinese Knowledge and Information Processing) to perform POS tagging. Based on the POS tags, the system analyzes and extracts the core keywords and converts each sentence into a keyword string. Keyword expansion is then performed based on the Chinese semantic architecture and HowNet. After the expansion, each core keyword is given a weight according to its type, and the weight of each sentence is obtained by summing the weights of the keywords it contains. Based on these sentence weights, the sentences are ranked to obtain a core summary set. We also use the linear-algebra idea proposed by [Yihong Gong, Xin Liu, 2001] to build an auxiliary summary set that captures information the topic-based approach may miss, making the summary more complete. Finally, the system integrates the two summary sets into a single summary and takes readability into account so that the whole summary reads fluently. | en_US |
dc.language.iso | zh_TW | en_US |
dc.subject | automatic summarization | zh_TW |
dc.subject | Chinese automatic summarization | zh_TW |
dc.subject | automatic news summarization | zh_TW |
dc.subject | summarization | en_US |
dc.subject | text summarization | en_US |
dc.subject | automatic text summarization | en_US |
dc.subject | summarization for news | en_US |
dc.title | 中文新闻自动摘要系统 | zh_TW |
dc.title | Automatic Text Summarization System for Chinese News | en_US |
dc.type | Thesis | en_US |
dc.contributor.department | Institute of Multimedia Engineering | zh_TW |
Appears in Collections: | Thesis |
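
The sentence-weighting step described in the abstract (each keyword gets a weight according to its type, a sentence's weight is the sum of its keyword weights, and sentences are ranked by weight) can be illustrated with a minimal Python sketch. The weight values, the type names `core` and `expanded`, the function names, and the sample keywords are all hypothetical; the record does not state the values or data the thesis actually uses.

```python
from typing import Dict, List

# Hypothetical weight per keyword type; the actual values are not given in the abstract.
TYPE_WEIGHTS: Dict[str, float] = {"core": 2.0, "expanded": 1.0}


def sentence_weight(keywords: List[str], keyword_types: Dict[str, str]) -> float:
    """Sum the weights of the known keywords that occur in one sentence."""
    return sum(TYPE_WEIGHTS[keyword_types[k]] for k in keywords if k in keyword_types)


def rank_sentences(
    sentences: List[List[str]], keyword_types: Dict[str, str], top_n: int
) -> List[int]:
    """Return the indices of the top_n heaviest sentences, in document order."""
    scored = [(sentence_weight(s, keyword_types), i) for i, s in enumerate(sentences)]
    # Highest weight first; ties are broken in favor of the earlier sentence.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:top_n]
    return sorted(i for _, i in best)


# Illustrative data: each sentence is already reduced to a keyword string (a list).
keyword_types = {"earthquake": "core", "rescue": "core", "quake": "expanded"}
sentences = [
    ["earthquake", "rescue"],           # weight 4.0
    ["weather"],                        # weight 0 (no known keywords)
    ["quake"],                          # weight 1.0
    ["rescue", "quake", "earthquake"],  # weight 5.0
]
core_summary = rank_sentences(sentences, keyword_types, top_n=2)
```

Returning the selected indices in document order, rather than in weight order, is one simple way to keep the extracted sentences readable; it is an assumption of this sketch, not a detail confirmed by the record.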