Bayesian exploratory clustering with entropy Chinese restaurant process

doi:10.3233/IDA-163332

Full metadata record

DC Field	Value	Language
dc.contributor.author	Liu, Chien-Liang	en_US
dc.contributor.author	Hsaio, Wen-Hoar	en_US
dc.contributor.author	Lin, Che-Yuan	en_US
dc.date.accessioned	2018-08-21T05:53:39Z	-
dc.date.available	2018-08-21T05:53:39Z	-
dc.date.issued	2018-01-01	en_US
dc.identifier.issn	1088-467X	en_US
dc.identifier.uri	http://dx.doi.org/10.3233/IDA-163332	en_US
dc.identifier.uri	http://hdl.handle.net/11536/144979	-
dc.description.abstract	Data exploration is essential to data analytics, especially when one is confronted with massive datasets. Clustering is a commonly used technique in data exploration, since it can automatically group data instances into meaningful categories and capture the natural structure of the data. The traditional finite mixture model requires the number of clusters to be specified before the data are analyzed, and this parameter is crucial to clustering performance. The Chinese restaurant process (CRP) mixture model offers an alternative, allowing the model complexity to grow as more data instances are observed. Although the CRP provides the flexibility to create a new cluster for subsequent data instances, one still has to determine the hyperparameter of the prior and the parameters of the base distribution in the likelihood part. This work proposes a non-parametric clustering algorithm based on the CRP with two main differences. First, we propose to create a new cluster based on the entropy of the posterior, whereas the CRP uses a hyperparameter to control the probability of creating a new cluster. Second, we propose to dynamically adjust the parameters of the base distribution according to the mean of the observed data, based on Chebyshev's inequality. Additionally, detailed derivations and update rules are provided for posterior inference with the proposed collapsed Gibbs sampling algorithm. The experimental results indicate that the proposed algorithm avoids the need to specify the number of clusters and works well on several datasets.	en_US
dc.language.iso	en_US	en_US
dc.subject	Non-parametric model	en_US
dc.subject	exploratory learning	en_US
dc.subject	entropy	en_US
dc.subject	Chinese restaurant process	en_US
dc.subject	clustering	en_US
dc.title	Bayesian exploratory clustering with entropy Chinese restaurant process	en_US
dc.type	Article	en_US
dc.identifier.doi	10.3233/IDA-163332	en_US
dc.identifier.journal	INTELLIGENT DATA ANALYSIS	en_US
dc.citation.volume	22	en_US
dc.citation.spage	551	en_US
dc.citation.epage	568	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	工業工程與管理學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.contributor.department	Department of Industrial Engineering and Management	en_US
dc.identifier.wosnumber	WOS:000432011700006	en_US
Appears in Collections:	Journal Articles
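
As background for the abstract above, the following is a minimal sketch of the standard CRP prior and of the Shannon entropy of a discrete distribution. It is not the authors' algorithm: this record does not give the entropy-based rule for creating clusters, so none is reproduced here. The function names and the concentration parameter name `alpha` are assumptions chosen for illustration.

```python
import math
import random


def crp_seating_probs(table_counts, alpha):
    """Standard CRP prior for the next customer.

    Returns one probability per existing table (proportional to its size),
    followed by the probability of opening a new table (proportional to alpha).
    """
    denom = sum(table_counts) + alpha
    return [c / denom for c in table_counts] + [alpha / denom]


def sample_crp(num_customers, alpha, rng):
    """Seat customers one at a time and return the resulting table sizes."""
    counts = []
    for _ in range(num_customers):
        probs = crp_seating_probs(counts, alpha)
        k = rng.choices(range(len(probs)), weights=probs)[0]
        if k == len(counts):
            counts.append(1)  # last index means "open a new table"
        else:
            counts[k] += 1
    return counts


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = sample_crp(100, alpha=1.0, rng=rng)
    print("table sizes:", sizes)
    print("entropy of next-seat prior:", entropy(crp_seating_probs(sizes, 1.0)))
```

In the standard CRP, a larger `alpha` makes new clusters more likely. The abstract states that the proposed method replaces this hyperparameter with a criterion based on the entropy of the posterior.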