以網路為主之英對中專有名詞翻譯萃取 (Empirical Approach to Resolving English to Chinese Named Entity Translation)

Full metadata record

DC Field	Value	Language
dc.contributor.author	張晨輝	en_US
dc.contributor.author	Jhang, Chen-Huei	en_US
dc.contributor.author	梁婷	en_US
dc.contributor.author	Liang, Tyne	en_US
dc.date.accessioned	2014-12-12T01:59:24Z	-
dc.date.available	2014-12-12T01:59:24Z	-
dc.date.issued	2011	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT079955615	en_US
dc.identifier.uri	http://hdl.handle.net/11536/50523	-
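dc.description.abstract	The quality of proper-name translation affects many natural language processing applications, such as cross-language information retrieval, machine translation, and automatic question answering. Because web resources are abundant and updated quickly, recent work on proper-name translation has mostly extracted translation candidates from the page snippets returned by search engines, and has selected among the candidates with supervised or unsupervised learning, using features such as the frequency of the candidate and the proper name in the search results, the distance between them, and the ratio of their lengths. Proper names in each domain follow their own naming rules, a point that earlier work has rarely considered. This thesis therefore proposes extracting translation candidates from search results and using naming rules to support query expansion and candidate evaluation. We consider English-to-Chinese translated names in four domains: book titles, movie titles, medicine names, and company names. The proposed method has three stages. First, we use 13 features with a support vector machine (SVM) to identify the domain of a proper name. Next, we expand the query according to predefined domain naming rules. Finally, we extract candidates with predefined surface patterns and rank them by frequency and naming rules. In experiments on 3,315 names, the top-ranked candidate was the correct translation 82.3% of the time. Keywords: named entity translation, machine translation, web, natural language processing	zh_TW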
dc.description.abstract	Named entity translation plays an important role in many natural language processing (NLP) applications, such as machine translation, cross-language information retrieval, and question answering. Because the web is rich in information, many previous studies have drawn on web resources and search results. However, most of them do not consider the domain-specific naming rules that govern how names are translated. In this thesis, we propose an approach that extracts translations from search results and uses naming rules for query expansion and translation candidate evaluation. We extract translations of named entities in four categories: book, movie, medicine, and company. The proposed approach is implemented in three steps. First, we extract features and identify the domain of each named entity using a support vector machine. Then, we apply predefined naming rules for the different entity types to expand queries in order to retrieve more relevant results. Finally, we extract translation candidates with predefined surface patterns and evaluate the candidates. In the experiments, the proposed approach achieved an average top-1 inclusion rate of 82.3%. Keywords: named entity translation, machine translation, web-based, natural language processing	en_US
dc.language.iso	en_US	en_US
dc.subject	實體名稱翻譯 (named entity translation)	zh_TW
dc.subject	機器翻譯 (machine translation)	zh_TW
dc.subject	網路 (web)	zh_TW
dc.subject	自然語言處理 (natural language processing)	zh_TW
dc.subject	named entity translation	en_US
dc.subject	machine translation	en_US
dc.subject	web-based	en_US
dc.subject	natural language processing	en_US
dc.title	以網路為主之英對中專有名詞翻譯萃取	zh_TW
dc.title	Empirical Approach to Resolving English to Chinese Named Entity Translation	en_US
dc.type	Thesis	en_US
dc.contributor.department	資訊科學與工程研究所 (Institute of Computer Science and Engineering)	zh_TW
Appears in Collections:	Thesis
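
The final stage described in the abstract (extracting candidates with surface patterns and ranking them by frequency) can be illustrated with a minimal Python sketch. This record does not list the thesis's actual surface patterns, naming rules, or SVM features, so everything below is an assumption made for illustration: the three patterns, the function names `surface_patterns` and `rank_candidates`, and the sample snippets are hypothetical. The sketch ranks by frequency only and omits the naming-rule score.

```python
import re
from collections import Counter

# A run of two or more CJK Unified Ideographs (basic block only).
# Illustrative choice; a real system would likely need a wider range.
CJK = r"[\u4e00-\u9fff]{2,}"


def surface_patterns(name):
    """Build hypothetical surface patterns pairing `name` with a Chinese string."""
    n = re.escape(name)
    return [
        # Chinese string followed by the English name in parentheses
        re.compile(rf"({CJK})\s*[\(（]\s*{n}\s*[\)）]", re.IGNORECASE),
        # English name followed by a Chinese string in parentheses
        re.compile(rf"{n}\s*[\(（]\s*({CJK})\s*[\)）]", re.IGNORECASE),
        # Chinese title in 《》 marks directly before the English name
        re.compile(rf"《({CJK})》\s*{n}", re.IGNORECASE),
    ]


def rank_candidates(name, snippets):
    """Count pattern matches across snippets; return (candidate, count) pairs,
    most frequent first."""
    counts = Counter()
    for snippet in snippets:
        for pattern in surface_patterns(name):
            # Each pattern has one capture group, so findall yields strings.
            counts.update(pattern.findall(snippet))
    return counts.most_common()


if __name__ == "__main__":
    # Made-up snippets standing in for search-engine results.
    demo = [
        "Inception（全面啟動）影評",
        "《全面啟動》 Inception 是一部電影",
        "潛行凶間 (Inception) 2010",
    ]
    for candidate, count in rank_candidates("Inception", demo):
        print(candidate, count)
```

One limitation of this sketch: the first pattern captures the entire run of Chinese characters before the parenthesis, so surrounding words can be absorbed into a candidate. Filtering such candidates is the kind of step where length ratios or domain naming rules, as mentioned in the abstract, would plausibly apply.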

Files in This Item:

561501.pdf
