Extraction of name and transliteration in monolingual and parallel corpora

Full metadata record

DC Field	Value	Language
dc.contributor.author	Lin, T	en_US
dc.contributor.author	Wu, JC	en_US
dc.contributor.author	Chang, JS	en_US
dc.date.accessioned	2014-12-08T15:39:48Z	-
dc.date.available	2014-12-08T15:39:48Z	-
dc.date.issued	2004	en_US
dc.identifier.isbn	3-540-23300-8	en_US
dc.identifier.issn	0302-9743	en_US
dc.identifier.uri	http://hdl.handle.net/11536/27197	-
dc.description.abstract	Named entities in free text pose a challenge to text analysis in Machine Translation and Cross-Language Information Retrieval. These phrases are often transliterated into another language with a different sound inventory and writing system. Named entities found in free text are often not listed in bilingual dictionaries. Although it is possible to identify and translate named entities on the fly without a list of proper names and transliterations, an extensive list of existing transliterations will certainly ensure a high precision rate. We use a seed list of proper names and transliterations to train a Machine Transliteration Model. With the model, it is possible to extract proper names and their transliterations from monolingual or parallel corpora with high precision and recall rates.	en_US
dc.language.iso	en_US	en_US
dc.title	Extraction of name and transliteration in monolingual and parallel corpora	en_US
dc.type	Article; Proceedings Paper	en_US
dc.identifier.journal	MACHINE TRANSLATION: FROM REAL USERS TO RESEARCH, PROCEEDINGS	en_US
dc.citation.volume	3265	en_US
dc.citation.spage	177	en_US
dc.citation.epage	186	en_US
dc.contributor.department	電信工程研究所	zh_TW
dc.contributor.department	Institute of Communications Engineering	en_US
dc.identifier.wosnumber	WOS:000224611600020	-
Appears in Collections:	Conferences Paper