Full metadata record
DC Field | Value | Language
dc.contributor.author | 楊瑞敏 | en_US
dc.contributor.author | Yang, Ruin-Min | en_US
dc.contributor.author | 李嘉晃 | en_US
dc.contributor.author | Lee, Chia-Hoang | en_US
dc.date.accessioned | 2014-12-12T01:44:17Z | -
dc.date.available | 2014-12-12T01:44:17Z | -
dc.date.issued | 2009 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079757552 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/46089 | -
dc.description.abstract | According to research reports, the rapid growth of the Internet has caused the total volume of digital documents, images, and other data produced each year to grow by multiples. To help readers grasp the information in these electronic documents efficiently, this thesis develops an automatic summarization system that distills large collections of digital documents, so that users can understand their content quickly and effectively without losing the original information. The proposed system scores sentences from three aspects as the basis for selecting summary sentences: (1) the relationship between words and sentences; (2) the relationship between titles and sentences; (3) the relationship between sentences and sentences. Before scoring, the system uses an Alignment algorithm and the Mutual Reinforcement principle to remove low-information sentences from the dataset, so that they cannot be selected as summary sentences. The three aspects are implemented with the HITS algorithm, cosine similarity, and the PageRank algorithm, respectively. The dataset used in this thesis is the DUC dataset, an English dataset composed of news articles. Evaluation with the ROUGE tool shows that the summaries produced by the system achieve good performance. | zh_TW
dc.description.abstract | According to research reports, the rapid development of the Internet causes the amount of digital documents, video, and other data to multiply each year. To make the information in these electronic files efficient to digest, this thesis develops an automatic summarization system that filters uninformative material out of digital documents, so that users can grasp the content efficiently without losing the meaning of the original documents. The proposed system considers three aspects when scoring sentences: first, the relationship between words and sentences; second, the relationship between titles and sentences; third, the relationship between sentences and sentences. Before scoring, the system uses an Alignment algorithm and the Mutual Reinforcement Principle to remove sentences that carry little information from the original dataset, so that they are not selected as part of the summary. The HITS algorithm, cosine similarity, and the PageRank algorithm are employed, respectively, to implement these three aspects. The dataset used in this thesis is the DUC dataset, which consists of English news articles. Evaluation with the ROUGE tool shows that the summaries generated by the system perform well. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 多文件 (Multi-Document) | zh_TW
dc.subject | 摘要系統 (Summarization System) | zh_TW
dc.subject | Mutual Reinforcement原理 (Mutual Reinforcement Principle) | zh_TW
dc.subject | Multi-Document | en_US
dc.subject | Summarization System | en_US
dc.subject | Mutual Reinforcement Principle | en_US
dc.title | 多文件摘要系統基於Mutual Reinforcement原理 | zh_TW
dc.title | Multi-Document Summarization System Based on Mutual Reinforcement Principle | en_US
dc.type | Thesis | en_US
dc.contributor.department | 多媒體工程研究所 (Institute of Multimedia Engineering) | zh_TW
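
The three scoring aspects named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the record gives no formulas, so the tokenizer, the equal weighting of the three scores, the iteration counts, and every function name below are assumptions made for illustration. The Alignment-based filtering step is omitted because the abstract does not describe it in enough detail.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation (assumed)."""
    tokens = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [t for t in tokens if t]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def normalize(scores):
    """Scale scores to sum to 1 so the three aspects are comparable."""
    total = sum(scores)
    return [s / total for s in scores] if total else [0.0] * len(scores)

def hits_sentence_scores(bags, iterations=50):
    """Word-sentence aspect: HITS-style mutual reinforcement on the
    bipartite word/sentence graph (words and sentences score each other)."""
    vocab = set().union(*bags)
    word = {w: 1.0 for w in vocab}
    sent = [1.0] * len(bags)
    for _ in range(iterations):
        sent = [sum(word[w] * c for w, c in bag.items()) for bag in bags]
        norm = math.sqrt(sum(s * s for s in sent)) or 1.0
        sent = [s / norm for s in sent]
        word = {w: 0.0 for w in vocab}
        for s, bag in zip(sent, bags):
            for w, c in bag.items():
                word[w] += s * c
        norm = math.sqrt(sum(v * v for v in word.values())) or 1.0
        word = {w: v / norm for w, v in word.items()}
    return sent

def pagerank_sentence_scores(bags, damping=0.85, iterations=50):
    """Sentence-sentence aspect: PageRank over a cosine-similarity graph."""
    n = len(bags)
    if n == 0:
        return []
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out = [sum(row) for row in sim]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [(1 - damping) / n + damping * sum(
                    rank[i] * sim[i][j] / out[i] for i in range(n) if out[i] > 0)
                for j in range(n)]
    return rank

def summarize(title, sentences, k=2, weights=(1.0, 1.0, 1.0)):
    """Pick the k highest-scoring sentences, returned in document order.
    The linear combination and its weights are assumptions."""
    bags = [Counter(tokenize(s)) for s in sentences]
    title_bag = Counter(tokenize(title))
    word_sent = normalize(hits_sentence_scores(bags))
    title_sent = normalize([cosine(title_bag, bag) for bag in bags])
    sent_sent = normalize(pagerank_sentence_scores(bags))
    total = [weights[0] * a + weights[1] * b + weights[2] * c
             for a, b, c in zip(word_sent, title_sent, sent_sent)]
    top = sorted(range(len(sentences)), key=lambda i: total[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(title, sentences, k=2)` on a small list of sentences returns the two sentences with the highest combined score. A sentence that shares no vocabulary with the title or with the other sentences scores low on all three aspects, which is the behaviour the abstract attributes to low-information sentences.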
Appears in Collections: Theses


Files in This Item:

  1. 755201.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.