Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Hsieh, Sheau-Ling | en_US |
| dc.contributor.author | Chang, Wen-Yung | en_US |
| dc.contributor.author | Chen, Chi-Huang | en_US |
| dc.contributor.author | Weng, Yung-Ching | en_US |
| dc.date.accessioned | 2014-12-08T15:31:13Z | - |
| dc.date.available | 2014-12-08T15:31:13Z | - |
| dc.date.issued | 2013-07-01 | en_US |
| dc.identifier.issn | 2168-2194 | en_US |
| dc.identifier.uri | http://dx.doi.org/10.1109/JBHI.2013.2257815 | en_US |
| dc.identifier.uri | http://hdl.handle.net/11536/22232 | - |
| dc.description.abstract | Various web-related semantic similarity measures have been studied. However, measuring the semantic similarity between two terms remains a challenging task. Traditional ontology-based methodologies are limited in that both concepts must reside in the same ontology tree(s). Unfortunately, in practice, this assumption does not always hold. On the other hand, corpus-based methodologies can overcome the limitation if the corpus is sufficiently adequate. The web is now a continuously and enormously growing corpus. Therefore, a method of estimating semantic similarity is proposed that exploits the page counts of two biomedical concepts returned by the Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, and P AND Q, together with the web search hit counts of the defined lexico-syntactic patterns. The similarity scores of the different patterns are evaluated by adapting support vector machines for classification, to leverage the robustness of the semantic similarity measures. Experimental results validated against two datasets, dataset 1 provided by A. Hliaoutakis and dataset 2 provided by T. Pedersen, are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient with physician scores (SNOMED-CT: 0.705; MeSH: 0.723) compared with the measures of other methods. However, the correlation coefficients with coder scores (SNOMED-CT: 0.496; MeSH: 0.539) show the opposite outcome. In conclusion, the semantic similarity findings of the proposed method are close to physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitized, free-text medical records in the National Taiwan University Hospital database. | en_US |
| dc.language.iso | en_US | en_US |
| dc.subject | Semantic similarity | en_US |
| dc.subject | support vector machine | en_US |
| dc.subject | page-count-based | en_US |
| dc.subject | corpus-based | en_US |
| dc.subject | web search engine | en_US |
| dc.title | Semantic Similarity Measures in the Biomedical Domain by Leveraging a Web Search Engine | en_US |
| dc.type | Article | en_US |
| dc.identifier.doi | 10.1109/JBHI.2013.2257815 | en_US |
| dc.identifier.journal | IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS | en_US |
| dc.citation.volume | 17 | en_US |
| dc.citation.issue | 4 | en_US |
| dc.citation.spage | 853 | en_US |
| dc.citation.epage | 861 | en_US |
| dc.contributor.department | 交大名義發表 | zh_TW |
| dc.contributor.department | 交大工研院聯合研發中心 | zh_TW |
| dc.contributor.department | National Chiao Tung University | en_US |
| dc.contributor.department | NCTU/ITRI Joint Research Center | en_US |
| dc.identifier.wosnumber | WOS:000321218700013 | - |
| dc.citation.woscount | 0 | - |
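
The abstract in this record describes features built from the page counts returned for the queries P, Q, and P AND Q. The record does not say which co-occurrence formulas the authors used, so the sketch below is only an illustration: it assumes three measures that are common in page-count-based work (Jaccard, Dice, and pointwise mutual information). The function names, the `total_pages` parameter, and the example counts are all hypothetical, and no search engine is queried.

```python
import math


def web_jaccard(count_p, count_q, count_pq):
    """Jaccard coefficient from page counts: hits(P AND Q) / hits(P OR Q)."""
    if count_pq == 0:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def web_dice(count_p, count_q, count_pq):
    """Dice coefficient from page counts: 2 * hits(P AND Q) / (hits(P) + hits(Q))."""
    if count_pq == 0:
        return 0.0
    return 2 * count_pq / (count_p + count_q)


def web_pmi(count_p, count_q, count_pq, total_pages):
    """Pointwise mutual information, with counts normalized by an assumed index size."""
    if count_p == 0 or count_q == 0 or count_pq == 0:
        return 0.0
    joint = count_pq / total_pages
    independent = (count_p / total_pages) * (count_q / total_pages)
    return math.log2(joint / independent)


def page_count_features(count_p, count_q, count_pq, total_pages):
    """Collect the assumed co-occurrence scores into one feature vector."""
    return [
        web_jaccard(count_p, count_q, count_pq),
        web_dice(count_p, count_q, count_pq),
        web_pmi(count_p, count_q, count_pq, total_pages),
    ]


# Hypothetical page counts, chosen only to make the arithmetic easy to follow.
features = page_count_features(count_p=100, count_q=50, count_pq=25, total_pages=1000)
print(features)
```

A vector like this, possibly extended with hit counts for lexico-syntactic patterns, is the kind of input a support vector machine classifier could take. The actual feature set and training setup are described in the article, not in this record.
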
Appears in Collections: Journal Articles


Files in This Item:

  1. 000321218700013.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.