Title: | 經由片語重排做英對中機器翻譯 English to Chinese Machine Translation through Phrase Reordering |
Authors: | 張晉榮 Chang, Chin-Jung 梁婷 Liang, Tyne Institute of Computer Science and Engineering |
Keywords: | Machine Translation; Phrase Reordering |
Publication Date: | 2011 |
Abstract: | Machine translation has many important applications, such as reducing the cost of human translation, easing language localization work, assisting language learning, and supporting cross-lingual information retrieval. Statistical phrase-based translation systems such as Moses translate sentences using a language model and a phrase table. Because such systems do not account for word-order differences between source and target sentences, their output leaves room for improvement. Some previous studies reorder the source words using lexical phrase information so that the source word order more closely matches that of the target language. However, the lexical information in any one sentence recurs too rarely, which makes long-distance phrase reordering difficult. To address this, other studies reorder source phrases using syntactic information from syntax trees together with part-of-speech tags. These studies, however, did not examine how reordering models perform under different maximum phrase lengths. This thesis therefore proposes hierarchical phrase reordering models based on lexical information, syntactic information, and a hybrid of the two, together with a syntax-tree phrase reordering model. In the experiments, the four models were evaluated on a corpus of 1085 English sentences. The results show that the proposed approaches outperformed previous models, achieving better BLEU scores across the different phrase lengths. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079955523 http://hdl.handle.net/11536/50439 |
Appears in Collections: | Thesis |