Title: | 以知識本體為基礎之醫藥問答系統 Ontology-based Question Answering in Medicine |
Authors: | 黃立泓 Li-Hong Huang; 梁婷 Tyne Liang; Institute of Computer Science and Engineering |
Keywords: | Question answering;Ontology;Medicine |
Issue Date: | 2005 |
Abstract: | Automatic medical question answering draws on a domain ontology, question analysis, and information retrieval. In recent years the Unified Medical Language System (UMLS) has mostly been used as domain knowledge for medical query expansion. Unlike previous research, which focused on UMLS for query expansion, we use UMLS concepts to extract Concept-Verb-Concept patterns (CVC patterns) from a training corpus in order to improve the ranking of answer texts. For question analysis, a Naïve Bayes classifier assigns each question to one of four categories: diagnosis, therapy, etiology, or definition. The category serves as a basis for retrieving relevant answer texts from PubMed, and query expansion is used to increase recall in document retrieval. Answer texts are ranked by combining TF-IDF weights with CVC pattern weights. In an experiment with 203 questions, the proposed QA system achieves an average Mean Reciprocal Rank (MRR) of 0.63. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT009323590 http://hdl.handle.net/11536/79119 |
Appears in Collections: | Thesis |
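
The abstract reports its result as a Mean Reciprocal Rank. For readers unfamiliar with the metric, the following is a minimal sketch of the standard MRR definition, not the thesis's own evaluation code. The function name `mean_reciprocal_rank` and the sample ranks are hypothetical and chosen for illustration only; a question whose correct answer was not retrieved is assumed to contribute 0.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_correct_ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all questions.

    Each entry is the 1-based rank of the first correct answer text for
    one question, or None if no correct answer text was retrieved
    (assumed here to contribute 0 to the sum).
    """
    if not first_correct_ranks:
        raise ValueError("at least one question is required")
    total = 0.0
    for rank in first_correct_ranks:
        if rank is None:
            continue  # no correct answer retrieved: reciprocal rank is 0
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total / len(first_correct_ranks)


# Hypothetical ranks for four questions (not data from the thesis).
example_ranks = [1, 2, None, 4]
print(mean_reciprocal_rank(example_ranks))  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```

Under this definition an MRR of 0.63 corresponds, roughly, to the first correct answer text appearing between the first and second positions on average.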