Full metadata record
DC Field | Value | Language
dc.contributor.author | 張晨輝 | en_US
dc.contributor.author | Jhang, Chen-Huei | en_US
dc.contributor.author | 梁婷 | en_US
dc.contributor.author | Liang, Tyne | en_US
dc.date.accessioned | 2014-12-12T01:59:24Z | -
dc.date.available | 2014-12-12T01:59:24Z | -
dc.date.issued | 2011 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT079955615 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/50523 | -
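dc.description.abstract | The quality of proper-noun translation affects many natural language processing applications, such as cross-language information retrieval, machine translation, and automatic question answering. Because web resources are abundant and updated quickly, recent work on proper-noun translation mostly extracts translation candidates from the web snippets returned by search engines, and selects among candidates with supervised or unsupervised learning, using features such as the frequency of the candidate and the proper noun in the search results, the distance between them, and the ratio of their lengths. Proper nouns in each domain follow their own naming rules, a point previous work has rarely considered, so this thesis proposes extracting translation candidates from search results and using naming rules to support query expansion and candidate evaluation. In this thesis we consider English-to-Chinese translated names in four domains: book titles, movie titles, medicine names, and company names. The proposed method proceeds in three stages: first, we identify the domain of a proper noun using 13 features and a support vector machine (SVM); next, we expand the query according to predefined naming rules for that domain; finally, we extract candidates with predefined surface patterns and rank them by frequency and naming rules. In experiments on 3,315 names, the top-ranked candidate was the correct translation 82.3% of the time. Keywords: named entity translation, machine translation, web, natural language processing | zh_TW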
dc.description.abstract | Named entity translation plays an important role in many natural language processing (NLP) applications, such as machine translation, cross-language information retrieval, and question answering. Because the web is rich in information, many previous studies have drawn on web resources and search results. However, most of them do not consider the naming rules that apply to translations in different domains. In this thesis, we propose an approach that extracts translations from search results and uses naming rules for query expansion and translation candidate evaluation. We extract translations of named entities in four categories: book, movie, medicine, and company. The proposed approach is implemented in three steps. First, we extract features and identify the category of each named entity using a support vector machine. Then, we apply predefined naming rules for the different types of entities to expand queries, in order to obtain more relevant results. Finally, we extract translation candidates with defined surface patterns and evaluate the candidates. In our experiments, the proposed approach achieved an average top-1 inclusion rate of 82.3%. Keywords: named entity translation, machine translation, web-based, natural language processing | en_US
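dc.language.iso | en_US | en_US
dc.subject | named entity translation (實體名稱翻譯) | zh_TW
dc.subject | machine translation (機器翻譯) | zh_TW
dc.subject | web (網路) | zh_TW
dc.subject | natural language processing (自然語言處理) | zh_TW
dc.subject | named entity translation | en_US
dc.subject | machine translation | en_US
dc.subject | web-based | en_US
dc.subject | natural language processing | en_US
dc.title | Web-Based Extraction of English-to-Chinese Proper Noun Translations (以網路為主之英對中專有名詞翻譯萃取) | zh_TW
dc.title | Empirical Approach to Resolving English to Chinese Named Entity Translation | en_US
dc.type | Thesis | en_US
dc.contributor.department | Institute of Computer Science and Engineering (資訊科學與工程研究所) | zh_TW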
Appears in Collections: Thesis


Files in This Item:

  1. 561501.pdf

If the file is a zip archive, download and unzip it, then open index.html in the extracted folder with a browser to view the full text.
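
The abstract's final stage (extracting candidates with surface patterns, ranking them by frequency, and scoring by top-1 inclusion rate) can be illustrated with a minimal sketch. This is not the thesis implementation: the single "Chinese text followed by the English name in parentheses" pattern, the function names, and the sample snippets are all assumptions made for illustration, and the naming-rule scoring described in the abstract is omitted.

```python
import re
from collections import Counter


def extract_candidates(name, snippets):
    """Count Chinese strings that appear right before "(name)" in snippets.

    Hypothetical surface pattern: two or more CJK characters, optional
    whitespace, then the English name inside ASCII or full-width parentheses.
    """
    pattern = re.compile(
        r"([\u4e00-\u9fff]{2,})\s*[((]\s*" + re.escape(name) + r"\s*[))]",
        re.IGNORECASE,
    )
    counts = Counter()
    for snippet in snippets:
        counts.update(pattern.findall(snippet))
    return counts


def rank_candidates(counts):
    """Order candidates by descending frequency; ties broken alphabetically."""
    return [c for c, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def top1_inclusion_rate(predictions, gold):
    """Fraction of names whose top-ranked candidate equals the gold translation."""
    if not predictions:
        return 0.0
    hits = sum(
        1
        for name, ranked in predictions.items()
        if ranked and ranked[0] == gold.get(name)
    )
    return hits / len(predictions)


if __name__ == "__main__":
    # Made-up snippets standing in for search-engine results.
    snippets = [
        "電影哈利波特 (Harry Potter) 上映",
        "哈利波特(Harry Potter)小說",
        "哈利波特 (harry potter)",
    ]
    ranked = rank_candidates(extract_candidates("Harry Potter", snippets))
    print(ranked)
    print(top1_inclusion_rate({"Harry Potter": ranked}, {"Harry Potter": "哈利波特"}))
```

In the first sample snippet the pattern over-captures a preceding word, which is the kind of noise that frequency-based ranking across many snippets is meant to suppress.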