Full metadata record
DC Field | Value | Language
dc.contributor.author | Liang, T | en_US
dc.contributor.author | Shih, PK | en_US
dc.date.accessioned | 2014-12-08T15:37:08Z | -
dc.date.available | 2014-12-08T15:37:08Z | -
dc.date.issued | 2005 | en_US
dc.identifier.isbn | 3-540-26031-5 | en_US
dc.identifier.issn | 0302-9743 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/25518 | -
dc.description.abstract | Named Entity Recognition (NER) in biomedical literature is crucial to biomedical knowledge base automation. In this paper, both empirical rule-based and statistical approaches to protein entity recognition are presented and evaluated on a general corpus, GENIA 3.02p, and a new domain-specific corpus, SRC. Experimental results show that the rules derived from SRC are useful even though they are simpler and more general than those used by other rule-based approaches. Meanwhile, a concise HMM-based model with a rich set of features is presented and shown to be robust and competitive when compared with other successful hybrid models. In addition, the resolution of coordination variants, which are common in entity recognition, is addressed. By applying heuristic rules and a clustering strategy, the presented resolver is shown to be feasible. | en_US
dc.language.iso | en_US | en_US
dc.title | Empirical textual mining to protein entities recognition from PubMed corpus | en_US
dc.type | Article; Proceedings Paper | en_US
dc.identifier.journal | NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PROCEEDINGS | en_US
dc.citation.volume | 3513 | en_US
dc.citation.spage | 56 | en_US
dc.citation.epage | 66 | en_US
dc.contributor.department | 資訊工程學系 | zh_TW
dc.contributor.department | Department of Computer Science | en_US
dc.identifier.wosnumber | WOS:000230413100006 | -
Appears in Collections: Conferences Paper