Full metadata record
DC Field | Value | Language
dc.contributor.author | 方智誼 | en_US
dc.contributor.author | Fang, Chih-Yi | en_US
dc.contributor.author | 蔡錫鈞 | en_US
dc.contributor.author | Tsai, Shi-Chun | en_US
dc.date.accessioned | 2014-12-12T02:39:10Z | -
dc.date.available | 2014-12-12T02:39:10Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT070056060 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/73869 | -
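dc.description.abstract | In recent years, the growth of Internet applications has made cloud computing a popular technology, in which cloud resources are used to deliver a wide range of network applications. Amazon Elastic MapReduce (EMR) is a well-known commercial cloud computing service that lets users create Hadoop clusters of various sizes to run Hadoop MapReduce computations for many kinds of applications, such as log analysis, web indexing, data warehousing, and machine learning. How to make use of cloud resources is therefore a topic worth studying. Although Amazon Elastic MapReduce offers an integrated way to use Hadoop as a service, companies and schools that already own equipment still lack the software support needed to set up this type of service themselves. The main goal of this thesis is therefore to integrate several open-source projects (OpenStack, Hadoop, and Ceph) into a platform similar to the EMR service, so that users can quickly set up a Hadoop cluster on OpenStack through an interface and can also manage and operate the cluster through that interface. The cluster also integrates an Object Storage service that Hadoop MapReduce computations can use. The platform has been deployed on our university's OpenStack platform, and experiments were run to examine the quality of service of the instances and of the Object Storage. In addition, the design adopted in this thesis allows the interface code to be released as open source, so that others can obtain the code and customize it for their own use. | zh_TW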
dc.description.abstract | Cloud computing has become a popular technology in recent years. Applications can be built to provide elastic services over the network. Amazon is the most famous vendor of cloud services, and Elastic MapReduce (EMR) is one of its most popular services: users can rent and build scalable Hadoop clusters. Such a cluster can be used in a variety of applications, such as log analysis, web indexing, data warehousing, and machine learning. In this thesis, we integrate the open-source projects OpenStack and Hadoop to provide an EMR-like service on OpenStack. Hadoop programs are usually run from the command line; on our platform, users can set up and run a Hadoop cluster through a GUI. Along the way, (1) we use Ceph to provide object storage, so that our Hadoop cluster can process very large data sets and the results can be kept even after the cluster is released; (2) we use S3 storage for cluster communication and authentication; (3) we have modified the OpenStack Dashboard to provide a better user interface. Running on 4 servers, our experiments show that the platform can process up to 1 TB of data within a few hours. The capacity can be improved further with upgraded hardware, which we leave as future work. | en_US
dc.language.iso | en_US | en_US
dc.subject | 雲端計算 | zh_TW
dc.subject | OpenStack | zh_TW
dc.subject | Hadoop | zh_TW
dc.subject | Ceph | zh_TW
dc.subject | cloud computing | en_US
dc.subject | OpenStack | en_US
dc.subject | Hadoop | en_US
dc.subject | Ceph | en_US
dc.title | Openstack與Hadoop整合與研究 | zh_TW
dc.title | An integration and application of OpenStack and Hadoop | en_US
dc.type | Thesis | en_US
dc.contributor.department | 資訊科學與工程研究所 | zh_TW
Appears in Collections: Thesis