Analysis of Hadoop Distributed Environment for Data Storage and Data Computing

Master's === National Chiao Tung University === Institute of Communications Engineering === 100 === The primary subject of this thesis is the architecture of HDFS (Hadoop Distributed File System) and the Hadoop MapReduce software framework for distributed computing. A distributed file system is a file system that allows many computers to share their files and storage s...

Bibliographic Details
Main Author: 王耀駿
Other Authors: 張文鐘
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/22814669427908852751
id ndltd-TW-100NCTU5435023
record_format oai_dc
spelling ndltd-TW-100NCTU5435023 2015-10-13T20:37:27Z http://ndltd.ncl.edu.tw/handle/22814669427908852751 Analysis of Hadoop Distributed Environment for Data Storage and Data Computing Hadoop分散式資料儲存與計算環境的架構分析 王耀駿 Master's National Chiao Tung University Institute of Communications Engineering 100 The primary subject of this thesis is the architecture of HDFS (Hadoop Distributed File System) and the Hadoop MapReduce software framework for distributed computing. A distributed file system is a file system that allows many computers to share their files and storage space over the network, and distributed computing is a way to solve large computational problems in parallel by pooling distributed computing resources. HDFS and the MapReduce framework run on the same computer cluster: HDFS provides the file system service and stores large data sets on the disks of the cluster, while MapReduce applications process the large data sets stored in HDFS in parallel across the cluster. Both HDFS and the MapReduce framework follow a master/slave architecture, in which a cluster consists of a single master server and many slave servers. The master server is responsible for managing and coordinating the storage and computing resources provided by the slave servers in the cluster in order to serve requests from clients. All servers are fully connected and communicate with each other using TCP-based protocols and a streaming mechanism. The mechanisms of HDFS and the MapReduce framework are verified by studying the relevant open source code, and a computer cluster is set up to further clarify the operations. 張文鐘 2011 thesis (學位論文) 203 zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Communications Engineering === 100 === The primary subject of this thesis is the architecture of HDFS (Hadoop Distributed File System) and the Hadoop MapReduce software framework for distributed computing. A distributed file system is a file system that allows many computers to share their files and storage space over the network, and distributed computing is a way to solve large computational problems in parallel by pooling distributed computing resources. HDFS and the MapReduce framework run on the same computer cluster: HDFS provides the file system service and stores large data sets on the disks of the cluster, while MapReduce applications process the large data sets stored in HDFS in parallel across the cluster. Both HDFS and the MapReduce framework follow a master/slave architecture, in which a cluster consists of a single master server and many slave servers. The master server is responsible for managing and coordinating the storage and computing resources provided by the slave servers in the cluster in order to serve requests from clients. All servers are fully connected and communicate with each other using TCP-based protocols and a streaming mechanism. The mechanisms of HDFS and the MapReduce framework are verified by studying the relevant open source code, and a computer cluster is set up to further clarify the operations.
author2 張文鐘
author_facet 張文鐘
王耀駿
author 王耀駿
spellingShingle 王耀駿
Analysis of Hadoop Distributed Environment for Data Storage and Data Computing
author_sort 王耀駿
title Analysis of Hadoop Distributed Environment for Data Storage and Data Computing
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/22814669427908852751
work_keys_str_mv AT wángyàojùn analysisofhadoopdistributedenvironmentfordatastorageanddatacomputing
AT wángyàojùn hadoopfēnsànshìzīliàochǔcúnyǔjìsuànhuánjìngdejiàgòufēnxī
_version_ 1718050013182951424