Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yeh, JY | en_US |
dc.contributor.author | Ke, HR | en_US |
dc.contributor.author | Yang, WP | en_US |
dc.contributor.author | Meng, IH | en_US |
dc.date.accessioned | 2014-12-08T15:36:26Z | - |
dc.date.available | 2014-12-08T15:36:26Z | - |
dc.date.issued | 2005-01-01 | en_US |
dc.identifier.issn | 0306-4573 | en_US |
dc.identifier.uri | http://dx.doi.org/10.1016/j.ipm.2004.04.003 | en_US |
dc.identifier.uri | http://hdl.handle.net/11536/24779 | - |
dc.description.abstract | This paper proposes two approaches to address text summarization: modified corpus-based approach (MCBA) and LSA-based T.R.M. approach (LSA + T.R.M.). The first is a trainable summarizer, which takes into account several features, including position, positive keyword, negative keyword, centrality, and the resemblance to the title, to generate summaries. Two new ideas are exploited: (1) sentence positions are ranked to emphasize the significances of different sentence positions, and (2) the score function is trained by the genetic algorithm (GA) to obtain a suitable combination of feature weights. The second uses latent semantic analysis (LSA) to derive the semantic matrix of a document or a corpus and uses semantic sentence representation to construct a semantic text relationship map. We evaluate LSA + T.R.M. both with single documents and at the corpus level to investigate the competence of LSA in text summarization. The two novel approaches were measured at several compression rates on a data corpus composed of 100 political articles. When the compression rate was 30%, an average f-measure of 49% for MCBA, 52% for MCBA + GA, 44% and 40% for LSA + T.R.M. in single-document and corpus level were achieved respectively. (C) 2004 Elsevier Ltd. All rights reserved. | en_US |
dc.language.iso | en_US | en_US |
dc.subject | text summarization | en_US |
dc.subject | corpus-based approach | en_US |
dc.subject | latent semantic analysis | en_US |
dc.subject | text relationship map | en_US |
dc.title | Text summarization using a trainable summarizer and latent semantic analysis | en_US |
dc.type | Article; Proceedings Paper | en_US |
dc.identifier.doi | 10.1016/j.ipm.2004.04.003 | en_US |
dc.identifier.journal | INFORMATION PROCESSING & MANAGEMENT | en_US |
dc.citation.volume | 41 | en_US |
dc.citation.issue | 1 | en_US |
dc.citation.spage | 75 | en_US |
dc.citation.epage | 95 | en_US |
dc.contributor.department | 資訊工程學系 | zh_TW |
dc.contributor.department | 圖書館 | zh_TW |
dc.contributor.department | Department of Computer Science | en_US |
dc.contributor.department | Library | en_US |
dc.identifier.wosnumber | WOS:000224486000006 | - |
Appears in Collections: | Conference Papers |