Title: Bilingual sentence alignment based on punctuation statistics and lexicon
Authors: Chuang, TC
Wu, JC
Lin, T
Shei, WC
Chang, JS
Institute of Communications Engineering (電信工程研究所)
Publication date: 2005
Abstract: This paper presents a new method for aligning bilingual parallel texts based on punctuation statistics and lexical information. We show that punctuation statistics are an effective means of achieving good alignment results. Sentence alignment of bilingual texts written in disparate language pairs such as English and Chinese is reportedly more difficult. We examine the feasibility of using punctuation for high-accuracy sentence alignment. Alignment based solely on punctuation statistics achieves an encouraging precision rate on bilingual parallel corpora, and results improve when punctuation statistics and lexical information are used together. We tested an implementation of the proposed method on the parallel corpora of Sinorama Magazine and the Records of the Hong Kong Legislative Council, with satisfactory results.
URI: http://hdl.handle.net/11536/25099
ISBN: 3-540-24475-1
ISSN: 0302-9743
Journal: NATURAL LANGUAGE PROCESSING - IJCNLP 2004
Volume: 3248
Start page: 224
End page: 232
Appears in collections: Conference papers
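
Illustrative sketch: the abstract describes aligning sentences using punctuation statistics alone, but this record does not give the paper's actual model. The Python below is therefore only a rough illustration of the general idea, not the authors' algorithm. It runs a length-based dynamic program of the kind commonly used for sentence alignment, with punctuation counts standing in for sentence length. The punctuation inventory `PUNCT`, the bead shapes and penalties in `BEADS`, and the function names `punct_count` and `align` are all assumptions introduced here.

```python
# Assumed punctuation inventory: ASCII marks plus common full-width CJK marks.
PUNCT = set(",.;:!?\"'()-,。;:!?、「」『』()—…")

def punct_count(sentence):
    """Count the punctuation marks in one sentence."""
    return sum(ch in PUNCT for ch in sentence)

# Allowed bead shapes (number of source, target sentences) with assumed
# penalties that discourage insertions, deletions, and merges.
BEADS = {(1, 1): 0.0, (1, 0): 3.0, (0, 1): 3.0, (2, 1): 1.0, (1, 2): 1.0}

def align(src, tgt):
    """Return beads as (source indices, target indices) with minimal total cost.

    The cost of a bead is the absolute difference between the punctuation
    counts on the two sides, plus the penalty for the bead shape.
    """
    s = [punct_count(x) for x in src]
    t = [punct_count(x) for x in tgt]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Row-major order guarantees a cell is final before it is expanded.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                diff = abs(sum(s[i:ni]) - sum(t[j:nj]))
                candidate = cost[i][j] + diff + penalty
                if candidate < cost[ni][nj]:
                    cost[ni][nj] = candidate
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the beads.
    beads, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    return beads[::-1]
```

Calling `align(english_sentences, chinese_sentences)` returns a list of index pairs; for example, two English sentences whose combined punctuation matches a single Chinese sentence come back as one 2-1 bead. The abstract reports that adding lexical information improves results; in a sketch like this, that would presumably enter as an extra term in the bead cost, though how the paper combines the two signals is not stated in this record.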