Title: 相加模型下藉由單獨的單一核甘酸多形性關係探測其交互作用的趨勢
Detecting Interaction Patterns Based on Single SNP
Authors: 許庭瑋
Hsu, Ting-Wei
盧鴻興
Lu, Horng-Shing
Institute of Statistics
Keywords: Loss of power; expectation-conditional maximization; genome-wide association study; single nucleotide polymorphism; additive model; hypertension
Issue Date: 2008
Abstract: This thesis consists of two parts on detecting interaction patterns from single nucleotide polymorphism (SNP) associations under an additive model. The approach centers on the trade-off between loss of power and savings in computation time.

Testing for interaction associations in a genome-wide association study (GWAS) takes a tremendous amount of computation time. The first part establishes the relation between single-SNP association and paired-SNP association, so that computation time can be greatly reduced at the cost of some power. The second part uses the expectation-conditional maximization (ECM) algorithm to estimate λ_AB (relative penetrance rate for genotype AB), f_A (frequency of allele A), and f_B (frequency of allele B) from real genome-wide association data; these estimates in turn provide reasonable parameters for calculating the loss of power.

The trade-off between type I error (α) and type II error (β) is a well-known problem in statistical hypothesis testing. Because of multiple testing, however, a very small α such as 5x10^(-7) or 1x10^(-5) is common in GWAS, and the power (1-β) is then badly weakened by the resulting large β. In other words, a very small α makes the hypothesis test overly conservative.

Applying this approach to the hypertension data provided by WTCCC, we detected genes or SNPs already reported in the literature as related to hypertension, such as CHRM2 (rs7800093), KCNB2 (rs11782342), HTR3B (rs17116117), rs2820037, GAB1 (rs300916, rs300915, rs300913), BCAT1 (rs7961152, rs11613673, rs12424348), and MYBPC1 (rs11110912). We also detected some that have not been reported so far, such as rs825148, rs1553460, LOC100129858 (rs6840033), rs4131463, RPL18P4 (rs1528356), rs17797701, OTOG (rs11024327), rs10843660, CHST11 (rs11112069), SIP1 (rs8011855), and RHOJ (rs1957779), which merit statistical replication and biological experiments in the future.
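The two trade-offs named in the abstract can be illustrated with a minimal sketch. It is not the thesis's power calculation: it assumes a generic two-sided z-test whose statistic is N(ncp, 1) under the alternative, and the function names, the SNP count of 500,000, and the noncentrality value 3.0 are all hypothetical choices made only for illustration.

```python
from math import comb
from statistics import NormalDist


def n_pair_tests(n_snps: int) -> int:
    """Number of unordered SNP pairs, i.e. tests in an exhaustive pairwise scan."""
    return comb(n_snps, 2)


def z_test_power(ncp: float, alpha: float) -> float:
    """Power of a two-sided z-test at level alpha.

    Assumes the test statistic is N(ncp, 1) under the alternative
    (a generic normal approximation, not the model used in the thesis).
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha / 2)
    # Probability that the statistic falls in either rejection tail.
    return (1 - std.cdf(z_crit - ncp)) + std.cdf(-z_crit - ncp)


if __name__ == "__main__":
    n_snps = 500_000  # hypothetical panel size
    print(f"single-SNP tests: {n_snps}")
    print(f"pairwise tests:   {n_pair_tests(n_snps)}")

    ncp = 3.0  # hypothetical effect size on the z scale
    for alpha in (0.05, 1e-5, 5e-7):
        print(f"alpha={alpha:g}  power={z_test_power(ncp, alpha):.4f}")
```

Under these assumptions the pairwise scan needs several orders of magnitude more tests than the single-SNP scan, and for a fixed effect size the power falls steeply as α shrinks from 0.05 to the genome-wide thresholds quoted above.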
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079626520
http://hdl.handle.net/11536/42681
Appears in Collections: Thesis


Files in This Item:

  1. 652001.pdf

If the file is a zip archive, download and extract it, then open index.html in the extracted folder with a browser to read the full text.