Title: PPT by Ravi Namboori Network Architect on Hadoop & HDFS Architecture


1
Hadoop HDFS Architecture
  • Presentation By
  • Ravi Namboori
  • Visit us http://ravinamboori.net

2
What is HDFS?
  • Hadoop comes with a distributed filesystem called
    HDFS, which stands for Hadoop Distributed File
    System.

3
What Is a Distributed Filesystem?
  • When a dataset outgrows the storage capacity of a
    single physical machine, it becomes necessary to
    partition it across a number of separate
    machines.
  • Filesystems that manage the storage across a
    network of machines are called distributed
    filesystems.
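
As a rough illustration of partitioning a dataset across machines, the sketch below cuts a dataset into fixed-size blocks and assigns each block to a machine. The function name `partition` and the round-robin assignment are assumptions made for this example only; they are not HDFS's actual placement policy.

```python
def partition(data_size: int, block_size: int, machines: list[str]) -> list[tuple[int, int, str]]:
    """Split data_size bytes into blocks and assign each block to a machine.

    Returns a list of (offset, length, machine) tuples.
    Round-robin assignment is an illustrative assumption, not HDFS behavior.
    """
    if block_size <= 0 or not machines:
        raise ValueError("block_size must be positive and machines non-empty")

    placements = []
    offset = 0
    index = 0
    while offset < data_size:
        # The last block may be shorter than block_size.
        length = min(block_size, data_size - offset)
        placements.append((offset, length, machines[index % len(machines)]))
        offset += length
        index += 1
    return placements


if __name__ == "__main__":
    for placement in partition(250, 100, ["node-a", "node-b"]):
        print(placement)
```

With 250 bytes, a block size of 100, and two machines, this yields three blocks, the last one only 50 bytes long.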

4
Design of HDFS
  • Very large files
  • Streaming data access
  • Commodity hardware

5
Where HDFS Is Not A Good Fit
  • Low-latency data access
  • Lots of small files
  • Multiple writers, arbitrary file modifications
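
The slide does not say why many small files are a problem. A commonly cited reason is that the NameNode keeps filesystem metadata in memory, so the file count is bounded by that memory. The sketch below estimates the cost under an assumed rule of thumb of about 150 bytes per file or block object; that figure and the function name are assumptions for illustration, not values taken from this presentation.

```python
# Assumed rule of thumb: each file and each block costs roughly this
# many bytes of NameNode memory. Not a figure from the presentation.
BYTES_PER_OBJECT = 150


def namenode_memory_bytes(num_files: int, blocks_per_file: int = 1) -> int:
    """Estimate NameNode memory needed to track num_files files."""
    # One metadata object per file plus one per block of that file.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    estimate = namenode_memory_bytes(1_000_000)
    print(f"{estimate / 1_000_000:.0f} MB for one million single-block files")
```

Under these assumptions, one million single-block files need about 300 MB of metadata memory regardless of how little data each file holds.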

6
HDFS Architecture
An HDFS cluster consists of a single NameNode, a
master server that manages the file system
namespace and regulates access to files by
clients. It also includes a Secondary NameNode,
which periodically merges the namespace image
with the edit log.
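
The idea of updating a namespace image from a log of changes can be pictured with a toy model. In the sketch below the image is a plain dictionary from file path to block IDs, and the log is a list of operations replayed in order. The operation names (`create`, `add_block`, `delete`) and the data layout are hypothetical and chosen for illustration; they do not reflect the real on-disk formats.

```python
def apply_edit_log(image: dict[str, list[str]], edit_log: list[tuple]) -> dict[str, list[str]]:
    """Replay logged operations on a copy of the namespace image.

    image:    maps file path -> list of block IDs (toy representation)
    edit_log: tuples such as ("create", path), ("add_block", path, block_id),
              or ("delete", path); these names are hypothetical.
    """
    # Copy so the old image stays intact until the new one is complete.
    merged = {path: list(blocks) for path, blocks in image.items()}
    for op, path, *args in edit_log:
        if op == "create":
            merged[path] = []
        elif op == "add_block":
            merged[path].append(args[0])
        elif op == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged


if __name__ == "__main__":
    old_image = {"/logs/a": ["blk_1"]}
    log = [("create", "/logs/b"), ("add_block", "/logs/b", "blk_2"), ("delete", "/logs/a")]
    print(apply_edit_log(old_image, log))
```

Once the merged image has been produced, the replayed log entries are no longer needed, which is why periodic merging keeps the log from growing without bound.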
7
Why Is a Block in HDFS So Large?
  • HDFS blocks are large compared to disk blocks,
    and the reason is to minimize the cost of seeks.
  • By making a block large enough, the time to
    transfer the data from the disk can be made to be
    significantly larger than the time to seek to the
    start of the block.
  • Thus a transfer of a large file made of multiple
    blocks operates at close to the disk transfer
    rate.
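
The seek-versus-transfer argument can be checked with a little arithmetic. The sketch below computes the smallest block size for which seek time stays under a chosen fraction of transfer time. The numbers in the example (10 ms seek, 100 MB/s transfer, 1% target) are illustrative assumptions, not measurements from this presentation.

```python
def min_block_size_bytes(seek_time_s: float, transfer_rate_bps: float, seek_fraction: float) -> float:
    """Smallest block size (bytes) keeping seek time <= seek_fraction of transfer time.

    transfer_time = block_size / transfer_rate, so the condition
    seek_time <= seek_fraction * transfer_time rearranges to
    block_size >= seek_time * transfer_rate / seek_fraction.
    """
    if seek_fraction <= 0:
        raise ValueError("seek_fraction must be positive")
    return seek_time_s * transfer_rate_bps / seek_fraction


if __name__ == "__main__":
    # Assumed figures: 10 ms seek, 100 MB/s transfer rate, seek at most 1% of transfer.
    size = min_block_size_bytes(0.010, 100e6, 0.01)
    print(f"block size of about {size / 1e6:.0f} MB")
```

Under these assumed figures the block needs to be roughly 100 MB, which is far larger than a typical disk block.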

8
Advantage of HDFS
"Moving Computation Is Cheaper than Moving Data"
  • A computation requested by an application is much
    more efficient if it is executed near the data it
    operates on. This is especially true when the
    size of the data set is huge. HDFS provides
    interfaces for applications to move themselves
    closer to where the data is located.
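
One way to picture moving computation to the data is a scheduler that prefers a machine already holding a copy of the block it needs. The sketch below is a simplified assumption about how such a choice could be made; the function `pick_node` and its inputs are hypothetical and not part of any HDFS interface.

```python
def pick_node(block_replicas: list[str], free_nodes: list[str]) -> tuple[str, bool]:
    """Choose where to run a task that reads one block.

    block_replicas: nodes that store a copy of the block
    free_nodes:     nodes currently able to run a task
    Returns (node, is_local); is_local is False when the block
    would have to be read over the network.
    """
    if not free_nodes:
        raise ValueError("no free nodes available")
    for node in free_nodes:
        if node in block_replicas:
            return node, True   # computation moves to the data
    return free_nodes[0], False  # fallback: data moves to the computation


if __name__ == "__main__":
    print(pick_node(["n2", "n3"], ["n1", "n3"]))  # a local node is available
    print(pick_node(["n2"], ["n1"]))              # no local node is free
```

The fallback case is the expensive one: the whole block crosses the network, and that cost grows with the size of the data set.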

9
Thank You
  • Presentation By
  • Ravi Namboori
  • Visit us http://ravinamboori.in