Title: Predicting Antigenic Variants of Influenza A H3N2 Viruses by Building Relationships between Genetic Evolution and Antigenic Drift
Authors: Jhang-Wei Huang (黃章維)
Jinn-Moon Yang (楊進木)
Institute of Bioinformatics and Systems Biology
Keywords: Influenza; Antigenic
Issue date: 2005
Abstract: Pathogenic avian and human influenza viruses can cause severe damage to human society and the economy, so understanding the antigenic evolution of influenza viruses is important for prophylaxis and vaccine strain selection. To predict antigenic drift and to select the viruses likely to be progenitors of the next epidemic, most current approaches rely only on counting mutations in hemagglutinin (HA) protein sequences and on phylogenetic analysis. Recently, several reports have indicated relationships between mutations in HA protein sequences and antigen-antibody affinity, that is, between viral genetic evolution and antigenic drift. These reports observed that antigenic drift is more punctuated than genetic evolution and that genetic changes sometimes have a disproportionately large antigenic effect. In this thesis, we study an important question: which amino acid position changes in the HA protein sequence are highly correlated with changes in hemagglutination inhibition (HI) titer values? Information gain is used to measure the degree of association between genetic evolution and antigenic drift. A high information gain at a specific amino acid position (one of positions 1 to 329 of an HA sequence) means that mutations at that position are highly correlated with antigenic change as measured by the HI titer. This implies that the information gain of each position can be used to predict the association between genetic and antigenic change in HA sequences. A decision tree tool (C4.5) was used to select 21 important positions based on information gain. These 21 positions were further clustered into 6 groups; positions within the same group are highly correlated and show co-evolution. Using the information gain of each position together with the co-evolution information, we built a model to predict the association between genetic and antigenic evolution.
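The position-ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes each training example is a pair of aligned HA sequences labeled 1 (antigenic variant) or 0 (antigenically similar), and it encodes each position as a binary "changed / unchanged" feature. The function names (`entropy`, `information_gain`, `rank_positions`) and the four-residue toy sequences are hypothetical. Note also that C4.5 itself splits on gain ratio; plain information gain is shown here because that is the quantity the abstract names.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_positions(pairs, labels):
    """Rank 1-based alignment positions by information gain, highest first.

    pairs  : list of (seq_a, seq_b), aligned sequences of equal length
    labels : 1 if the pair is an antigenic variant, 0 otherwise
    """
    length = len(pairs[0][0])
    gains = []
    for pos in range(length):
        # Binary feature: did the amino acid change at this position?
        changed = [int(a[pos] != b[pos]) for a, b in pairs]
        gains.append((pos + 1, information_gain(changed, labels)))
    return sorted(gains, key=lambda item: item[1], reverse=True)


# Hypothetical toy data (not real HA sequences or HI titers).
pairs = [
    ("KNGS", "ENGS"),  # position 1 changed
    ("KNGS", "KNGT"),  # position 4 changed
    ("KNGS", "EDGS"),  # positions 1 and 2 changed
    ("KNGS", "KNGS"),  # no change
]
labels = [1, 0, 1, 0]

ranking = rank_positions(pairs, labels)
for position, gain in ranking:
    print(f"position {position}: information gain = {gain:.3f}")
```

In this toy set a change at position 1 coincides exactly with the variant label, so that position ranks first, while the fully conserved position 3 carries no information. A real analysis would need a rule for deriving labels from HI titers, which the abstract does not specify.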
Our method uses both sequence features (position-specific amino acid changes) and structure features (contact maps); on the training set, the model's accuracy was 91% with sequence features and 96% with structure features. This accuracy is higher than that of a traditional Hamming distance method on the same data set. Most of the immunodominant positions identified by our method are located on epitope sites and are consistent with previous work. Finally, the prediction model (using the positions selected by information gain) was applied to two test sets. The prediction accuracy was 74% for 50 vaccine strain cases from WER and 87% for 5928 historical cases, and the model successfully predicted transitions between influenza virus clusters (99%). These results demonstrate that our approach is robust and useful for predicting the relationship between genetic evolution and antigenic drift, and that it is potentially useful for vaccine development.
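For comparison, a Hamming distance baseline of the kind mentioned in the abstract might look like the sketch below. The decision rule (call a pair an antigenic variant when the number of differing positions reaches a threshold), the threshold value of 2, and the toy sequences are all assumptions made for illustration; the abstract does not state how the baseline was configured.

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def predict_variant(seq_a, seq_b, threshold):
    """Assumed rule: 1 (variant) if the mutation count reaches the threshold."""
    return int(hamming_distance(seq_a, seq_b) >= threshold)


def accuracy(pairs, labels, threshold):
    """Fraction of pairs whose predicted label matches the given label."""
    hits = sum(predict_variant(a, b, threshold) == y
               for (a, b), y in zip(pairs, labels))
    return hits / len(labels)


# Hypothetical toy data (not real HA sequences or HI titers).
toy_pairs = [
    ("KNGSQ", "ENGSQ"),  # 1 mutation, labeled variant
    ("KNGSQ", "KDGTR"),  # 3 mutations, labeled variant
    ("KNGSQ", "KNGSQ"),  # 0 mutations, labeled similar
    ("KNGSQ", "KNGTR"),  # 2 mutations, labeled similar
]
toy_labels = [1, 1, 0, 0]

baseline_accuracy = accuracy(toy_pairs, toy_labels, threshold=2)
print(f"Hamming baseline accuracy: {baseline_accuracy:.2f}")
```

The toy labels are chosen so that one single-mutation pair is a variant and one two-mutation pair is not, which a pure mutation count cannot separate. This mirrors the abstract's point that some genetic changes have a disproportionately large antigenic effect, and it is why weighting individual positions can help.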
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009351502
http://hdl.handle.net/11536/79854
Appears in Collections: Thesis


Files in This Item:

  1. 150201.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to view the full text.