Topics on Learning From Incomplete Data Using Multivariate Skew Distributions

Full metadata record

DC Field	Value	Language
dc.contributor.author	林資荃	en_US
dc.contributor.author	Lin Tzy-Chy	en_US
dc.contributor.author	陳鄰安	en_US
dc.contributor.author	林宗儀	en_US
dc.contributor.author	Chen, Lin-An	en_US
dc.contributor.author	Lin, Tsung-I	en_US
dc.date.accessioned	2014-12-12T02:47:25Z	-
dc.date.available	2014-12-12T02:47:25Z	-
dc.date.issued	2008	en_US
dc.identifier.uri	http://140.113.39.130/cdrfb3/record/nctu/#GT009226801	en_US
dc.identifier.uri	http://hdl.handle.net/11536/76897	-
dc.description.abstract	對於經常面對實際資料的研究者來言, 處理具複雜遺失形態的多變量資料之統計學習方法是一個非常重要課題。在許多應用的資料分析上, 為了數學上的便利, 研究者習慣地假設資料為常態分佈。然而, 當資料具有非典型或遠離中心的觀察值, 常態性假設將會不成立, 因而產生無效的推論。本篇論文包含三部份論說: 分別在多變量的偏斜常態和偏斜t分佈的架構下, 提供處理具異質性母體與遺失性資料的方法。第一部份論說中, 我們發展便利的計算的工具來分析具遺失訊息的混合多變量偏斜常態模型, 在隨機遺失機制下, 我們提供可解析的EM演算法, 用以處理模型的參數與遺失資料的監督學習。本文所提出的混合分析器,包含了高斯混合模式這個特例, 對於處理不完整高維度資料的從事者, 提供較寬廣的考慮層面。第二部份論說提出一些分析高維偏斜t模型的方法來處理資料同時具有厚尾, 不對稱與觀察值遺失現象。我們提出一個蒙地卡羅ECM演算法, 用來估計參數與填補遺失資料值。此外, 我們也發展一個有效率的資料擴增(data augmentation)演算法, 藉由多重填補法說明參數與遺失觀察值的不確定性。最後的論說中, 我們提供以混合多變量偏斜t模型為基準的方法, 用於不完整實驗資料的穩健性分類。我們也使用數個實際例子與模擬資料來闡述所提出的方法。	zh_TW
dc.description.abstract	Statistical learning from multivariate data with complex missing patterns is an important issue for researchers who face real-life problems in their own practice. In many applied data analyses, practitioners routinely assume that outcomes are normally distributed for mathematical convenience. However, the normality assumption is vulnerable to atypical or outlying observations and can consequently yield invalid inferences. This dissertation consists of three essays on the use of multivariate skew normal and skew t distributions to deal with data in the presence of population heterogeneity and possible missing values. In the first essay, we develop computationally flexible tools for the analysis of multivariate skew normal mixtures when missing values occur in the data. Under a missing at random mechanism, we present an analytically feasible EM algorithm for the supervised learning of parameters as well as missing observations. The proposed mixture analyzer, which includes the commonly used Gaussian mixtures as a special case, allows practitioners to handle incomplete multivariate data sets under a wide variety of considerations. The second essay presents some analytical devices for multivariate skew t models when fat-tailed, asymmetric and missing observations may occur simultaneously in the input data. We present a Monte Carlo version of the ECM algorithm, which estimates the parameters and retrieves the missing observations with a single imputation. Additionally, an efficient data augmentation scheme is developed to account for the uncertainty based on multiply imputed parameters and missing outcomes. In the last essay, we offer a multivariate skew t mixture-based approach to robust clustering for experimental data with incomplete observations. Several real data sets as well as simulations are used to illustrate the proposed methods.	en_US
dc.language.iso	en_US	en_US
dc.subject	EM 演算法	zh_TW
dc.subject	多變量偏斜常態模型	zh_TW
dc.subject	多變量截斷常態分佈	zh_TW
dc.subject	資料擴增	zh_TW
dc.subject	MCECM演算法	zh_TW
dc.subject	隨機遺失	zh_TW
dc.subject	多變量偏斜t模型	zh_TW
dc.subject	多重填補	zh_TW
dc.subject	多變量截斷t分佈	zh_TW
dc.subject	EM algorithm	en_US
dc.subject	MSN model	en_US
dc.subject	Multivariate truncated normal	en_US
dc.subject	data augmentation	en_US
dc.subject	MCECM algorithm	en_US
dc.subject	missing at random	en_US
dc.subject	MST model	en_US
dc.subject	multiple imputation	en_US
dc.subject	truncated t distribution	en_US
dc.title	多變量偏斜分佈對於不完整資料之研究	zh_TW
dc.title	Topics on Learning From Incomplete Data Using Multivariate Skew Distributions	en_US
dc.type	Thesis	en_US
dc.contributor.department	統計學研究所	zh_TW
Appears in Collections:	Thesis

Files in This Item:

680101.pdf

680102.pdf
