Full metadata record
DC Field | Value | Language
dc.contributor.author | 劉姍澄 | zh_TW
dc.contributor.author | 劉敦仁 | zh_TW
dc.contributor.author | Liu, Shan-Cheng | en_US
dc.contributor.author | Liu, Duen-Ren | en_US
dc.date.accessioned | 2018-01-24T07:37:13Z | -
dc.date.available | 2018-01-24T07:37:13Z | -
dc.date.issued | 2016 | en_US
dc.identifier.uri | http://etd.lib.nctu.edu.tw/cdrfb3/record/nctu/#GT070363404 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/139087 | -
dc.description.abstract | 資料收集、彙整與分析成為企業進行大數據分析需面臨的挑戰,傳統資料倉儲技術無法有效分析處理巨量資料,如何因應巨量資料及導入巨量資料倉儲之設計與建置,是企業因應大數據分析提昇競爭力之重要議題。 傳統資料倉儲擁有即時、複雜運算的查詢能力,而分散式運算適合儲存大量資料和非結構化資料。本研究藉助分散式檔案系統(HDFS)與Hive服務儲存大量歷史性資料,查詢與分析不同種類的儲存資料,並探討比較Hadoop YARN、Spark SQL及Drill之巨量資料查詢。 本研究以個案公司的生產製程資料庫及工單資料庫為資料來源,為簡化導入流程與節省轉換成本,保留原有企業資料倉儲架構,並另新導入巨量資料倉儲,讓使用者查詢資料倉儲系統時加快維運作業速度及正確性。本研究著重於巨量資料的整理、彙整,以及快速得到分析結果,提出巨量資料倉儲系統架構解決方案,研究成果可提供企業規劃導入巨量資料倉儲系統之參考。 | zh_TW
dc.description.abstract | The IT industry is facing new challenges in data collection, aggregation, and analysis for big data analytics. Traditional data warehousing techniques cannot effectively process and analyze big data. To cope with big data analytics and strengthen their competitive advantage, enterprises therefore need to design and deploy big data warehousing systems. Conventional data warehousing techniques support real-time processing and computationally complex queries, while distributed computing techniques are suited to storing and processing large volumes of data and unstructured data. This research uses the Hadoop Distributed File System (HDFS) and Hive to store, query, and analyze large volumes of varied data. Hadoop YARN, Spark SQL, and Drill are investigated and compared for querying big data. Production process data and work order data collected from the case company are used to deploy the big data warehousing system. To simplify implementation and reduce the cost of system transition, the big data warehousing system is deployed alongside the case company's original data warehouse architecture, which is retained. Users benefit from faster and more accurate operations when collecting, aggregating, and analyzing data. This research focuses on aggregating and analyzing big data and proposes a solution architecture for deploying big data warehousing systems. The results can serve as a reference model for enterprises planning to deploy big data warehousing systems. | en_US
dc.language.iso | zh_TW | en_US
dc.subject | 資料倉儲 | zh_TW
dc.subject | 大數據 | zh_TW
dc.subject | YARN | zh_TW
dc.subject | HDFS | zh_TW
dc.subject | Hive | zh_TW
dc.subject | Data Warehouse | en_US
dc.subject | Big Data | en_US
dc.subject | YARN | en_US
dc.subject | HDFS | en_US
dc.subject | Hive | en_US
dc.title | 導入巨量資料倉儲之設計與建置 - 以半導體封測公司為例 | zh_TW
dc.title | Design and Deployment of Big Data Warehouses - A Case Study of a Packaging and Testing Company | en_US
dc.type | Thesis | en_US
dc.contributor.department | 管理學院資訊管理學程 | zh_TW
Appears in Collections: Thesis