Full metadata record
DC Field | Value | Language
dc.contributor.author | Chang, TH | en_US
dc.contributor.author | Lee, CH | en_US
dc.date.accessioned | 2014-12-08T15:26:20Z | -
dc.date.available | 2014-12-08T15:26:20Z | -
dc.date.issued | 2003 | en_US
dc.identifier.isbn | 0-7803-7902-0 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/18707 | -
dc.description.abstract | Chinese unknown word extraction is an important problem in Chinese language processing, and it is difficult for two reasons. First, almost any Chinese character can either stand as a word by itself or form part of other words. Second, Chinese text has no blanks between words to mark their boundaries. Although several approaches have been proposed, they have drawbacks. In this paper, we present a method that extracts Chinese unknown words more efficiently and precisely. It retains its efficiency and accuracy even when the document set used for training is small, and it can also extract unknown words that occur rarely. These advantages make it practical for real applications. | en_US
dc.language.iso | en_US | en_US
dc.subject | Chinese unknown word | en_US
dc.subject | Corpus-based method | en_US
dc.title | Automatic Chinese unknown word extraction using small-corpus-based method | en_US
dc.type | Proceedings Paper | en_US
dc.identifier.journal | 2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS | en_US
dc.citation.spage | 459 | en_US
dc.citation.epage | 464 | en_US
dc.contributor.department | 資訊工程學系 | zh_TW
dc.contributor.department | Department of Computer Science | en_US
dc.identifier.wosnumber | WOS:000189300200077 | -
Appears in Collections: Conferences Paper