Title: Integration and Study of OpenStack and Hadoop
An integration and application of OpenStack and Hadoop
Author: 方智誼
Fang, Chih-Yi
蔡錫鈞
Tsai, Shi-Chun
Institute of Computer Science and Engineering
Keywords: cloud computing; OpenStack; Hadoop; Ceph
Publication date: 2013
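Abstract: The growth of Internet applications in recent years has made cloud computing a popular technology, with cloud resources used to deliver many kinds of network applications. Amazon Elastic MapReduce (EMR) is a well-known commercial cloud computing service through which users can create Hadoop clusters of various sizes to run Hadoop MapReduce computations for many types of applications, such as log analysis, web indexing, data warehousing, and machine learning. How to make use of cloud resources is therefore a topic worth studying. Although Amazon Elastic MapReduce offers an integrated way to use Hadoop, companies and schools that already own equipment still lack the software support needed to set up this kind of service themselves. The main goal of this thesis is therefore to integrate several open source projects, namely OpenStack, Hadoop, and Ceph, into a platform that provides a service similar to EMR. Through its interface, users can quickly set up a Hadoop cluster on OpenStack and then manage and operate that cluster. The cluster also integrates an object storage service that Hadoop MapReduce computations can use. The platform has been deployed on our university's OpenStack platform, and we ran experiments to examine the service quality of the instances and the object storage. In addition, the design of this thesis allows the interface code to be released as open source so that others can obtain the code and customize it for their own use.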
Cloud computing has become a popular technology in recent years, and applications can be built to provide elastic services over the network. Amazon is the most famous vendor of cloud services, and Elastic MapReduce (EMR) is one of its most popular services, allowing users to rent and build scalable Hadoop clusters. Such a cluster can be used in a variety of applications, such as log analysis, web indexing, data warehousing, and machine learning. In this thesis, we integrate the open source projects OpenStack and Hadoop to provide an EMR-like service on OpenStack. Users can set up and run a Hadoop cluster through a GUI, whereas Hadoop programs are usually handled from the command line. Along the way, (1) we use Ceph to support object storage, so that our Hadoop cluster can process very large data sets and the results remain stored even after the cluster is released; (2) we use S3 storage for cluster communication and authentication; (3) we modify the OpenStack Dashboard to provide a better user interface. Running on 4 servers, our experiments show that the platform can process up to 1 TB of data within a few hours. The capacity can be further improved with upgraded hardware, which we leave as future work.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070056060
http://hdl.handle.net/11536/73869
Appears in collections: Thesis