Machine Learning-Based Configuration Parameter Tuning on Hadoop System

doi:10.1109/BigDataCongress.2015.64

Full metadata record

DC Field	Value	Language
dc.contributor.author	Chen, Chi-Ou	en_US
dc.contributor.author	Zhuo, Ye-Qi	en_US
dc.contributor.author	Yeh, Chao-Chun	en_US
dc.contributor.author	Lin, Che-Min	en_US
dc.contributor.author	Liao, Shih-wei	en_US
dc.date.accessioned	2017-04-21T06:48:18Z	-
dc.date.available	2017-04-21T06:48:18Z	-
dc.date.issued	2015	en_US
dc.identifier.isbn	978-1-4673-7278-7	en_US
dc.identifier.uri	http://dx.doi.org/10.1109/BigDataCongress.2015.64	en_US
dc.identifier.uri	http://hdl.handle.net/11536/136033	-
dc.description.abstract	Apache Hadoop is a software framework that processes large-scale datasets across a cluster of distributed machines using the MapReduce programming model. However, system administrators face two main challenges in managing a Hadoop system: (1) tuning the parameters appropriately is difficult because the behavior and characteristics of large-scale distributed systems are highly complicated; (2) dozens of configuration parameters affect system performance, which makes the parameter tuning task troublesome. In this paper, we focus on optimizing Hadoop MapReduce job performance by tuning configuration parameters, and we propose an analytical method that helps system administrators choose approximately optimal configuration parameters based on the characteristics of each application. Our approach has two key phases: prediction and optimization. The prediction phase estimates the performance of a MapReduce job, whereas the optimization phase strategically searches for approximately optimal configuration parameters by invoking the predictor repeatedly. In our evaluation, our approach helps system administrators improve performance by about 2X to 8X compared with traditional methods.	en_US
dc.language.iso	en_US	en_US
dc.subject	Distributed System	en_US
dc.subject	Machine Learning	en_US
dc.subject	Optimization Problem	en_US
dc.title	Machine Learning-Based Configuration Parameter Tuning on Hadoop System	en_US
dc.type	Proceedings Paper	en_US
dc.identifier.doi	10.1109/BigDataCongress.2015.64	en_US
dc.identifier.journal	2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015	en_US
dc.citation.spage	386	en_US
dc.citation.epage	392	en_US
dc.contributor.department	資訊工程學系	zh_TW
dc.contributor.department	Department of Computer Science	en_US
dc.identifier.wosnumber	WOS:000380443700054	en_US
dc.citation.woscount	0	en_US
Appears in Collections:	Conferences Paper
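
The two-phase approach described in the abstract (a predictor that estimates job performance, and an optimizer that invokes it repeatedly) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state which learning model or search strategy the paper uses, so `predict_runtime` is a hand-written toy cost function standing in for a trained model, and `optimize` uses plain random search. The parameter names are real Hadoop MapReduce settings, but the candidate values and the default configuration are hypothetical.

```python
import random

# Hypothetical search space. The keys are real Hadoop MapReduce parameters;
# the candidate values are illustrative and not taken from the paper.
SEARCH_SPACE = {
    "mapreduce.task.io.sort.mb": [50, 100, 200, 400],
    "mapreduce.job.reduces": [1, 2, 4, 8, 16],
    "mapreduce.map.output.compress": [False, True],
}

# Hypothetical starting configuration used as the baseline.
DEFAULT_CONFIG = {
    "mapreduce.task.io.sort.mb": 100,
    "mapreduce.job.reduces": 1,
    "mapreduce.map.output.compress": False,
}


def predict_runtime(config):
    """Prediction phase: toy stand-in for a trained performance model.

    Returns an estimated runtime in arbitrary units. This formula is
    invented for the sketch and is NOT the paper's predictor.
    """
    sort_mb = config["mapreduce.task.io.sort.mb"]
    reduces = config["mapreduce.job.reduces"]
    compress = config["mapreduce.map.output.compress"]
    runtime = 600.0 / reduces + 5.0 * reduces  # parallelism gain vs. per-task overhead
    runtime += 4000.0 / sort_mb                # larger sort buffer, fewer spills
    if compress:
        runtime *= 0.8                         # less intermediate data to shuffle
    return runtime


def optimize(predictor, space, start, budget=200, seed=0):
    """Optimization phase: random search that calls the predictor repeatedly.

    Random search is an assumption of this sketch; the paper's search
    strategy is not described in the abstract.
    """
    rng = random.Random(seed)
    best_config, best_cost = dict(start), predictor(start)
    for _ in range(budget):
        candidate = {name: rng.choice(values) for name, values in space.items()}
        cost = predictor(candidate)
        if cost < best_cost:
            best_config, best_cost = candidate, cost
    return best_config, best_cost


best_config, best_cost = optimize(predict_runtime, SEARCH_SPACE, DEFAULT_CONFIG)
print("estimated runtime with defaults:", predict_runtime(DEFAULT_CONFIG))
print("estimated runtime after search: ", best_cost)
print("suggested configuration:", best_config)
```

Because the search only consults the predictor, no MapReduce job is run during optimization. In a real setting the predictor would be fitted to measured job runs, and the quality of the suggested configuration would depend on how well it generalizes.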