Title: | 多輸入感測啟動子的數學建模 Mathematical Modeling for Multi-Input-Sensing Promoters |
Authors: | 邱譯玄 Chiu, I-Hsuan; 何信瑩 Ho, Shinn-Ying; 陳文亮 Chen, Wen-Liang; Institute of Bioinformatics and Systems Biology |
Keywords: | model; synthetic biology; promoter; transcription factor; inducer |
Issue Date: | 2012 |
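Abstract: | Synthetic biology has been an active research field in recent years. Using genetic recombination techniques, synthetic biologists make artificially designed genetic circuits express specific, predictable functions in bacterial cells. The basic biological parts that make up a genetic circuit include promoters, ribosome binding sites, protein coding sequences, and terminators; of these, the promoter is required to initiate gene transcription. Depending on the design, the promoter may need to carry several transcription factor binding sites, so that it integrates the concentrations of several transcription factors as input signals and sets the expression rate of the regulated gene as its output. We call such promoters "multi-input sensing promoters". Because precise tuning of gene expression is a central concern for synthetic biologists, it is important to have a mathematical formula that relates the input concentrations of transcription factors (or their inducers) to the output gene expression rate.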
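Several recent studies have built mathematical models and validated them experimentally. They mainly address co-regulation by two transcription factors; co-regulation by three or more has not yet been studied. By type of model, these studies fall into two classes: the first uses simple, general formulas, and the second uses more complex models designed for a specific promoter. For experimental validation, these studies fitted optimized parameters to the experimental data but performed no further independent tests, so it remains unconfirmed whether the models still reflect experimental results accurately for newly generated expression rate outputs.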
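In this study, we propose an improved mathematical model that describes the relationship between a three-dimensional input of transcription factor inducers and the output gene expression rate. The model has 14 biologically meaningful parameters, which capture the gene expression rate under each transcription factor binding state and the characteristics of the Hill function curves.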
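The abstract does not state the model equation, so the following is only a minimal sketch of one reading that happens to give 14 parameters: three binding sites yield 2^3 = 8 binding states, each with its own expression rate, and each inducer has one Hill curve with a half-saturation constant and a Hill coefficient (8 + 3 + 3 = 14). The independent-binding assumption, the function names, and all numeric values are hypothetical and are not taken from the thesis.

```python
from itertools import product


def hill(x, k, n):
    """Hill function: response fraction at inducer concentration x."""
    if x <= 0:
        return 0.0
    return x**n / (k**n + x**n)


def expression_rate(inducers, state_rates, ks, ns):
    """Average the 8 state-specific rates, weighted by state probability.

    inducers    -- three inducer concentrations
    state_rates -- dict mapping each state (s1, s2, s3), s in {0, 1}, to a rate
    ks, ns      -- three half-saturation constants and three Hill coefficients
    Assumption: the three sites respond independently, so the probability of
    a state is the product of the per-site probabilities.
    """
    p = [hill(x, k, n) for x, k, n in zip(inducers, ks, ns)]
    total = 0.0
    for state in product((0, 1), repeat=3):
        weight = 1.0
        for s, p_i in zip(state, p):
            weight *= p_i if s else 1.0 - p_i
        total += weight * state_rates[state]
    return total


# Hypothetical parameter values, chosen only for illustration.
state_rates = {s: 0.1 + 0.1 * sum(s) for s in product((0, 1), repeat=3)}
ks = (1.0, 2.0, 0.5)
ns = (2.0, 1.5, 1.0)

basal = expression_rate((0.0, 0.0, 0.0), state_rates, ks, ns)
half = expression_rate(ks, state_rates, ks, ns)
```

With no inducer present the sketch returns the rate of the (0, 0, 0) state, and with every inducer at its half-saturation constant all eight states are weighted equally.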
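To verify that the model yields a stable set of parameter solutions for a given set of gene expression rates, we ran two numerical simulation experiments before the biological experiments. In the first, we specified 64 inducer concentration combinations and 14 target parameter values and used them to generate 64 simulated gene expression rates. Treating everything except the 14 target parameter values as known, we then solved for those values with a parameter optimization tool. Among the 30 parameter solutions, most were stable and close to the target values. The best solution had a fitting error of 1.29E-07; in an independent test with the same solution, the fitting error was 5.67E-07.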
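The second numerical experiment aimed to mimic the deviations that arise in biological experiments and to examine how the parameter solutions behave when the data contain such deviations. To this end, we added perturbations of up to 5% to the gene expression rate data from the first experiment. Among the 30 parameter solutions, most were still stable and close to the target values. The best solution had a fitting error of 3.00E-03; in an independent test with the same solution, the fitting error was 5.89E-03.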
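The data-generation step of the two numerical experiments can be sketched as follows. The abstract gives only the counts (64 combinations, noise within 5%), so everything else here is an assumption: the four concentration levels per inducer (4^3 = 64), the stand-in function `toy_rate`, the uniform multiplicative noise, and the definition of the fitting error as a mean squared relative error are all hypothetical choices made for illustration.

```python
import random
from itertools import product

# Hypothetical: four levels per inducer give 4**3 = 64 combinations.
LEVELS = (0.0, 0.1, 1.0, 10.0)


def toy_rate(a, b, c):
    """Hypothetical stand-in for the model's expression rate (always > 0)."""
    return 0.1 + a / (1.0 + a) + 0.5 * b / (1.0 + b) + 0.25 * c / (1.0 + c)


def add_noise(values, max_fraction, rng):
    """Scale each value by a uniform random factor within +/- max_fraction."""
    return [v * (1.0 + rng.uniform(-max_fraction, max_fraction)) for v in values]


def fitting_error(predicted, observed):
    """Mean squared relative error (an assumed definition, not the thesis's)."""
    terms = [((p - o) / o) ** 2 for p, o in zip(predicted, observed)]
    return sum(terms) / len(terms)


grid = list(product(LEVELS, repeat=3))   # 64 inducer combinations
clean = [toy_rate(*x) for x in grid]     # first experiment: noise-free data
rng = random.Random(0)                   # fixed seed for repeatability
noisy = add_noise(clean, 0.05, rng)      # second experiment: noise within 5%
error_vs_noisy = fitting_error(clean, noisy)
```

In a full reproduction, an optimizer would search for the parameter values that minimize `fitting_error` on one data set, and the independent test would evaluate the same error on a second data set that the optimizer never saw.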
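Next, to verify that the model can accurately reproduce real experimental data, we used biological parts provided by iGEM (International Genetically Engineered Machine) to assemble a promoter with binding sites for three transcription factors, and obtained 64 green fluorescent protein expression rates from 64 combinations of transcription factor concentrations. After optimizing the parameters with a genetic algorithm, we found that 11 parameters were stable among the 30 parameter solutions. The best solution had a fitting error of 4.88E-03; in an independent test with the same solution, the fitting error was 1.78E-02.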
In summary, our mathematical model reproduces gene expression rates close to those of real experiments and can predict newly generated gene expression rates. We expect the model to be of practical use to synthetic biologists, for example in helping to design multi-input sensing promoters and in modeling the input-output relationships of these promoters. We also expect that widespread use of multi-input sensing promoters will broaden the applications of synthetic biology.
Synthetic biology has recently become an active research field. Using gene recombination techniques, synthetic biologists can implant pre-designed genetic circuits into bacterial cells and direct the cells to perform specific tasks. Genetic circuits are composed of basic biological parts (BioBricks), including promoters, ribosome binding sites, protein coding sequences, and terminators. Among these parts, promoters are necessary for gene transcription. Depending on their needs, synthetic biologists may choose a promoter that contains multiple transcription factor binding sites (TFBSs). Such a promoter integrates multiple transcription factor (TF) concentration inputs and outputs gene expression at a specific rate. We define this type of promoter as a “multi-input sensing promoter”. Because fine-tuning gene expression is an important topic for synthetic biologists, a mathematical model is needed to describe the relationship between the input TF (or TF inducer) concentrations and the gene expression rate. Several studies have combined modeling with experimental validation. However, they focused only on co-regulation by two TFs and did not address co-regulation by three or more TFs. By type of model, these studies fall into two classes: simple, general models and complex, problem-dependent models. For experimental validation, these studies optimized the parameters for the best fit but did not perform independent tests, so it has not been confirmed whether the models perform equally well on newly generated expression rate outputs.
In this research, we propose an improved mathematical model that describes the relationship between three inducer inputs and the gene expression rate output. The model contains 14 biologically meaningful parameters, which reflect the expression rates under different TF binding states and the features of the Hill function curves.
To validate that the model yields a robust set of parameter solutions for a given set of gene expression data, we performed two numerical experiments before the biological experiments. In the first numerical experiment, we assigned 64 inducer concentration combinations and 14 target parameter values to the model to generate 64 simulated gene expression rates. Assuming that everything above except the 14 target parameter values was known, we used a parameter optimization tool to solve for these values. Among 30 runs, most of the solutions were robust and close to the target parameter values. The fitting error of the top solution was 1.29E-07, and its fitting error in an independent test was 5.67E-07.
In the second numerical experiment, we aimed to simulate the deviations that arise in biological experiments and to test parameter-solving performance under these deviations. To achieve this, we added noise of up to 5% to the gene expression rate data from the first numerical experiment. Among 30 runs, most of the solutions were still robust and close to the target parameter values. The fitting error of the top solution was 3.00E-03, and its fitting error in an independent test was 5.89E-03.
To validate that the model can precisely simulate real expression data, we used BioBricks from iGEM (International Genetically Engineered Machine) to construct a promoter that contains TFBSs for three kinds of TFs. Using 64 inducer concentration combinations, we acquired 64 GFP expression rates.
After parameter optimization, 11 parameters were robust among 30 runs. The fitting error of the top solution was 4.88E-03, and its fitting error in an independent test was 1.78E-02.
In conclusion, our mathematical model can fit the real expression data and predict newly generated outputs. We expect the model to be useful to synthetic biologists, for example by helping them design multi-input sensing promoters and model the input-output relationships of these promoters. Furthermore, we expect that extensive use of multi-input sensing promoters will broaden the applications of synthetic biology. |
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT070057203 http://hdl.handle.net/11536/72666 |
Appears in Collections: | Thesis |