Title: | 利用網路探勘之中英專名萃取研究 BILINGUAL PROPER NOUNS EXTRACTION THROUGH WEB MINING |
Authors: | 蘇傳堯 Chuan-Yao Su; 梁婷 Tyne Liang; Institute of Computer Science and Engineering |
Keywords: | proper noun; OOV terms; Web mining; query expansion |
Issue Date: | 2005 |
Abstract: | Proper noun translation plays a significant role in many natural language applications, such as question answering, machine translation, and cross-language information retrieval. Earlier work on bilingual term extraction relied mainly on parallel or comparable texts or on general dictionaries; as Web resources have become widespread, a growing number of studies use the Web instead. This thesis proposes an integrated method that treats Web pages as the corpus for Chinese-English proper noun translation. It combines query expansion, surface patterns collected in advance from the Web to help extract translation candidates, and a new ranking formula that orders the candidates to produce the final translation. In experiments on 1376 proper nouns, the top-ranked candidate is the correct translation in 87% of cases for English-to-Chinese extraction and in 83% of cases for Chinese-to-English extraction. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009323597 http://hdl.handle.net/11536/79128 |
Appears in Collections: | Thesis |
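The candidate-extraction step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two parenthesis-based surface patterns, the `extract_candidates` function, the sample snippets, and the plain frequency ranking are hypothetical stand-ins. The record does not give the thesis's mined patterns or its ranking formula, and the query-expansion step is left out.

```python
import re
from collections import Counter

# A run of two or more CJK characters (a middle dot is allowed inside names).
CJK = r"[\u4e00-\u9fff·]{2,}"

# Hypothetical surface patterns, NOT the ones mined in the thesis.
# "{term}" is a placeholder for the escaped source term; the named group
# "cand" captures the Chinese translation candidate.
PATTERNS = [
    r"(?P<cand>" + CJK + r")\s*[（(]\s*{term}\s*[）)]",  # 湯姆漢克斯(Tom Hanks)
    r"{term}\s*[（(]\s*(?P<cand>" + CJK + r")\s*[）)]",  # Tom Hanks(湯姆漢克斯)
]

def extract_candidates(term, snippets):
    """Return (candidate, count) pairs, most frequent first.

    Frequency ranking is an assumed placeholder for the thesis's formula.
    """
    counts = Counter()
    for raw in PATTERNS:
        # str.replace is used instead of str.format because the regex
        # itself contains braces such as {2,}.
        regex = re.compile(raw.replace("{term}", re.escape(term)), re.IGNORECASE)
        for snippet in snippets:
            for match in regex.finditer(snippet):
                counts[match.group("cand")] += 1
    return counts.most_common()

# Made-up snippets standing in for search-engine results.
SNIPPETS = [
    "主演：湯姆漢克斯(Tom Hanks)",
    "Tom Hanks（湯姆漢克斯）出生於加州",
    "Tom Hanks (湯姆漢克) 的電影",
]

ranked = extract_candidates("Tom Hanks", SNIPPETS)
print(ranked)
```

In this sketch a candidate that directly follows other Chinese text would be captured together with that text, so a real system would need a boundary-detection step; the sample snippets avoid the issue by placing punctuation before each candidate.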