Title: A two-stage mining framework to explore the key risk conditions on crash severity
Authors: 陳文斌
Chen, Wen-Pin
邱裕鈞
藍武王
Chiou, Yu-Chiun
Lan, Lawrence W.
Department of Transportation and Logistics Management
Keywords: Crash Severity; Genetic Mining Rule; Mixed Logit Model; Bivariate Probit Model; Stepwise Rule-Mining Algorithm; Categorical Variable
Issue Date: 2011
Abstract: Most studies of traffic crash data use statistical models, such as logistic regression, logit models, and ordered probit models, to investigate the factors contributing to crash severity. These studies typically enter all potential factors into the model and test the significance of each one individually. However, crashes are often caused by a series of errors rather than a single factor, and examining all potential combinations of conditions is almost impossible without a systematic method. This study therefore proposes a two-stage analytical framework to identify and test the risk conditions associated with crash severity. The first stage develops a genetic mining rule (GMR) model to identify the risk conditions that best explain the degree of severity. The second stage sets the mined risk conditions as dummy explanatory variables, with crash severity as the dependent variable, and estimates a mixed logit (MXL) model for one-vehicle crashes and a bivariate probit (BP) model for two-vehicle crashes.
To demonstrate the applicability of the proposed framework and to identify the risk conditions of crash severity, a case study of Taiwan freeway crashes is conducted. For one-vehicle crashes, a total of 5,563 crashes at three severity levels, A1 (fatality), A2 (injury), and A3 (property damage only), are collected from Taiwan’s freeway accident investigation reports for 2003-2007. A total of 21 explanatory variables (risk factors) are considered, including surface condition, weather condition, lighting condition, driver gender, use of safety belt, and use of cell phone; all of these variables are categorical. To represent the relationship between the explanatory variables and crash severity, each chromosome in the genetic algorithm encodes one potential if-then rule. The conditions in the “if” part are termed the antecedent part, and those in the “then” part the consequent part. The antecedent part consists of at least one and at most 21 variables, while the consequent part contains only one variable, the severity level. In general, a rule is a knowledge representation of the form “If A Then C,” where A is a set of cases satisfying a conjunction of predicting attribute values and C is a set of cases with the same predicted outcome. A total of 29 rules are mined, achieving overall correct prediction rates of 75.10% in training and 73.80% in validation. An MXL model is then estimated with the antecedents of the 29 mined rules (i.e., risk conditions) as dummy explanatory variables (the two-stage approach). For comparison, an MXL model with the 21 original explanatory variables is also estimated (the one-stage approach). The results show that the two-stage MXL model performs better than the one-stage MXL model in terms of the likelihood ratio and can successfully identify several key risk conditions. Corresponding countermeasures are then proposed for the identified key risk conditions.
In the same vein, the framework is applied to two-vehicle crashes, using a total of 1,088 crashes collected from Taiwan’s freeway accident investigation reports for 2008-2009. In the second stage, the MXL model is replaced by the BP model, since no bivariate MXL model is currently available. The results show that the proposed two-stage framework also successfully identifies the key risk conditions for two-vehicle crash severity, indicating its practical value and potential for further development.
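Illustrative sketch (not taken from the thesis): the abstract describes each rule as an antecedent of one to 21 categorical conditions plus a single severity consequent, and says the mined antecedents become dummy explanatory variables in the second stage. The Python below shows one way that encoding might look. The class name Rule, the helper names, and the variable names and values ("surface", "lighting", "seat_belt") are hypothetical, chosen only for illustration; the genetic search itself and the MXL/BP estimation are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """An if-then rule: a conjunction of conditions and one severity level."""
    antecedent: dict   # hypothetical encoding: {variable name: required category}
    consequent: str    # severity level: "A1", "A2", or "A3"

    def matches(self, crash: dict) -> bool:
        # A crash satisfies the rule when every antecedent condition holds.
        return all(crash.get(var) == val for var, val in self.antecedent.items())


def rule_dummies(crashes, rules):
    """Return one 0/1 dummy per rule for each crash (stage-two inputs)."""
    return [[1 if rule.matches(crash) else 0 for rule in rules] for crash in crashes]


def rule_accuracy(crashes, rule):
    """Share of crashes covered by the rule whose severity equals its consequent."""
    covered = [c for c in crashes if rule.matches(c)]
    if not covered:
        return 0.0
    return sum(c["severity"] == rule.consequent for c in covered) / len(covered)


# Made-up records, not data from the thesis.
crashes = [
    {"surface": "wet", "lighting": "dark", "seat_belt": "no", "severity": "A1"},
    {"surface": "wet", "lighting": "dark", "seat_belt": "yes", "severity": "A2"},
    {"surface": "dry", "lighting": "day", "seat_belt": "yes", "severity": "A3"},
]
rules = [
    Rule({"surface": "wet", "lighting": "dark"}, "A1"),
    Rule({"seat_belt": "yes"}, "A3"),
]

dummies = rule_dummies(crashes, rules)
```

In this sketch each row of dummies would serve as the explanatory variables for one crash in the second-stage model, while rule_accuracy is only a simple stand-in for whatever fitness measure the genetic algorithm actually uses, which the abstract does not specify.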
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079436804
http://hdl.handle.net/11536/40887
Appears in Collections: Thesis