hadoop training in hyderabad@kellytechnologies - PowerPoint PPT Presentation

About This Presentation
Title:

hadoop training in hyderabad@kellytechnologies

Description:

Kelly Technologies is a Hadoop training institute in Hyderabad, providing Hadoop training by real-time faculty. www.kellytechno.com/Hyderabad/Course/Hadoop-Training

Slides: 27
Provided by: kellytechnologies

Transcript and Presenter's Notes

Title: hadoop training in hyderabad@kellytechnologies


1
Distributed Computing Overview
Presented By
Kelly Technologies
www.kellytechno.com
2
Agenda
  • What is distributed computing
  • Why distributed computing
  • Common Architecture
  • Best Practice
  • Case study
  • Condor
  • Hadoop HDFS and MapReduce

www.kellytechno.com
3
What is Distributed Computing/System?
  • Distributed computing
  • A field of computer science that studies
    distributed systems.
  • The use of distributed systems to solve
    computational problems.
  • Distributed system
  • Wikipedia
  • There are several autonomous computational
    entities, each of which has its own local memory.
  • The entities communicate with each other by
    message passing (see the sketch after this list).
  • Operating System Concepts
  • The processors communicate with one another
    through various communication lines, such as
    high-speed buses or telephone lines.
  • Each processor has its own local memory.
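
A minimal sketch of message passing between two autonomous "nodes", written here as two threads in one Java program that communicate only over a TCP socket (port 9000 and the message text are arbitrary; this example is illustrative and not part of the original slides).

import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;

// Two "nodes" with separate local memory that cooperate only by exchanging messages.
public class MessagePassingDemo {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(9000);   // node A's mailbox

        // Node A: waits for a message from a peer.
        Thread nodeA = new Thread(() -> {
            try (Socket peer = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(peer.getInputStream()))) {
                System.out.println("Node A received: " + in.readLine());
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        nodeA.start();

        // Node B: connects to node A and sends a message; no shared memory is used.
        try (Socket socket = new Socket("localhost", 9000);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            out.println("hello from node B");
        }
        nodeA.join();
        server.close();
    }
}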

www.kellytechno.com
4
What is Distributed Computing/System?
  • Distributed program
  • A computer program that runs on a distributed
    system
  • Distributed programming
  • The process of writing such programs

www.kellytechno.com
5
What is Distributed Computing/System?
  • Common properties
  • Fault tolerance
  • When one or more nodes fail, the whole system
    can still work, though performance may degrade.
  • Need to check the status of each node
  • Each node plays only a partial role
  • Each computer has only a limited, incomplete view
    of the system. Each computer may know only one
    part of the input.
  • Resource sharing
  • Each user can share the computing power and
    storage resources in the system with other users
  • Load sharing
  • Dispatching tasks across the nodes helps spread
    the load over the whole system (see the sketch
    after this list).
  • Easy to expand
  • Adding nodes should take little effort, ideally
    none at all.
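
A single-process analogy of load sharing, sketched with a Java thread pool: a dispatcher hands tasks to a fixed set of workers so the load is spread evenly. In a real distributed system the workers would be separate machines; the worker and task counts here are arbitrary.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Dispatch many small tasks across a fixed pool of workers to share the load.
public class LoadSharingDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService workers = Executors.newFixedThreadPool(4); // 4 "nodes"
        for (int task = 0; task < 12; task++) {
            final int id = task;
            workers.submit(() ->
                    System.out.println("task " + id + " ran on "
                            + Thread.currentThread().getName()));
        }
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
    }
}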

www.kellytechno.com
6
Why Distributed Computing?
  • The nature of the application
  • Performance
  • Computing intensive
  • The task spends most of its time on computation,
    for example computing π.
  • Data intensive
  • The task deals with a large number of files or
    very large files, for example Facebook or the
    LHC (Large Hadron Collider).
  • Robustness
  • No SPOF (Single Point Of Failure)
  • Other nodes can re-execute a task that was
    running on a failed node.

www.kellytechno.com
7
Common Architectures
  • Communicate and coordinate work among concurrent
    processes
  • Processes communicate by sending/receiving
    messages
  • Synchronous/asynchronous (see the sketch after
    this list)
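
A small sketch of the synchronous/asynchronous distinction using Java's CompletableFuture: the synchronous call blocks the caller until the reply arrives, while the asynchronous call registers a callback and keeps working. The askRemoteNode helper is hypothetical, standing in for any request sent to another process.

import java.util.concurrent.CompletableFuture;

public class SyncAsyncDemo {
    // Hypothetical stand-in for a request sent to another node.
    static String askRemoteNode(String request) {
        try { Thread.sleep(500); } catch (InterruptedException ignored) { }
        return "reply to " + request;
    }

    public static void main(String[] args) {
        // Synchronous: the caller blocks until the reply is available.
        String reply = askRemoteNode("ping");
        System.out.println("sync:  " + reply);

        // Asynchronous: the caller registers a callback and keeps working.
        CompletableFuture<Void> done =
                CompletableFuture.supplyAsync(() -> askRemoteNode("ping"))
                                 .thenAccept(r -> System.out.println("async: " + r));
        System.out.println("caller continues while the async reply is pending");
        done.join(); // wait at the end so the demo prints everything
    }
}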

www.kellytechno.com
8
Common Architectures
  • Master/Slave architecture
  • Master/slave is a model of communication where
    one device or process has unidirectional control
    over one or more other devices
  • Database replication
  • The source database can be treated as the master
    and the destination database as a slave.
  • Client-server
  • e.g., web browsers and web servers

www.kellytechno.com
9
Common Architectures
  • Data-centric architecture
  • Using a standard, general-purpose relational
    database management system vs. customized
    in-memory or file-based data structures and
    access methods
  • Using dynamic, table-driven logic vs. logic
    embodied in previously compiled programs
  • Stored procedures vs. logic running in
    middle-tier application servers
  • Shared databases as the basis for communicating
    between parallel processes vs. direct
    inter-process communication via message-passing
    functions

www.kellytechno.com
10
Best Practice
  • Data intensive or computing intensive?
  • Consider the data size and the amount of data
  • Consider the attributes of the data you consume
  • Computing intensive
  • We can move the data to the nodes where we can
    execute jobs
  • Data intensive
  • We can partition/replicate the data to different
    nodes, then execute our tasks on those nodes
  • Reduce data replication (copying) while tasks are
    executing
  • Master nodes need to know the data location
  • No data loss when incidents happen
  • SAN (Storage Area Network)
  • Data replication on different nodes
  • Synchronization
  • When splitting tasks across different nodes, how
    can we make sure these tasks are synchronized?

www.kellytechno.com
11
Best Practice
  • Robustness
  • The system stays safe when one or several nodes
    fail
  • Failed nodes should recover when they come back
    online, with little or no further action needed
  • e.g., Condor only restarts its daemon
  • Failure detection
  • When any node fails, the master node can detect
    the situation.
  • e.g., heartbeat detection (see the sketch after
    this list)
  • Apps/users don't need to know that a partial
    failure happened.
  • Restart tasks on other nodes for users
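
A minimal heartbeat-detection sketch, assuming a master that records the last heartbeat time per node and treats a node as failed when no heartbeat has arrived within a timeout. The class, node names, and timeout value are illustrative; this is not Condor's or Hadoop's actual implementation.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Master-side view: nodes call heartbeat() periodically; failedNodes() lists
// the nodes whose last heartbeat is older than the timeout.
public class HeartbeatMonitor {
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    public HeartbeatMonitor(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    public void heartbeat(String nodeId) {               // called on every heartbeat message
        lastSeen.put(nodeId, System.currentTimeMillis());
    }

    public List<String> failedNodes() {                   // nodes silent for too long
        long now = System.currentTimeMillis();
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (now - e.getValue() > timeoutMillis) failed.add(e.getKey());
        }
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        HeartbeatMonitor monitor = new HeartbeatMonitor(1000);   // 1 s timeout
        monitor.heartbeat("node-1");
        monitor.heartbeat("node-2");
        Thread.sleep(1500);
        monitor.heartbeat("node-2");                       // node-1 stays silent
        System.out.println("failed: " + monitor.failedNodes()); // [node-1]
    }
}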

www.kellytechno.com
12
Best Practice
  • Network issues
  • Bandwidth
  • Bandwidth must be considered when copying files
    to other nodes, if we want to execute a task on
    nodes that do not yet hold its data.
  • Scalability
  • Easy to expand
  • e.g., in Hadoop, modify the configuration and
    start the daemons on the new node
  • Optimization
  • What can we do if the performance of some nodes
    is not good?
  • Monitor the performance of each node
  • Based on any exchanged information, such as
    heartbeats or logs
  • Resume the same task on another node

www.kellytechno.com
13
Best Practice
  • App/User
  • shouldn't need to know how the nodes communicate
    with each other
  • User mobility: a user can access the system from
    a single entry point, or from anywhere
  • e.g., a Grid UI (user interface)
  • e.g., a Condor submit machine

www.kellytechno.com
14
Case study - Condor
  • Condor
  • Designed for computing-intensive jobs
  • Queuing policy
  • Matches tasks to computing nodes
  • Resource classification
  • Each resource advertises its attributes, and the
    master classifies resources accordingly (see the
    matchmaking sketch after this list)
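
A toy matchmaking sketch in the spirit of Condor's resource classification: each machine advertises its attributes, and the negotiator selects the machines whose attributes satisfy a job's requirements. The attribute names and matching rule are invented for illustration (Java 16+ records); they are not Condor's ClassAd language.

import java.util.List;

// Toy matchmaking: pick machines whose advertised resources cover the job's needs.
public class MatchmakerDemo {
    record Machine(String name, int cpus, int memoryMb) { }
    record Job(String name, int cpusNeeded, int memoryMbNeeded) { }

    static List<Machine> match(Job job, List<Machine> pool) {
        return pool.stream()
                .filter(m -> m.cpus() >= job.cpusNeeded()
                          && m.memoryMb() >= job.memoryMbNeeded())
                .toList();
    }

    public static void main(String[] args) {
        List<Machine> pool = List.of(
                new Machine("exec-1", 4, 8192),
                new Machine("exec-2", 2, 2048),
                new Machine("exec-3", 8, 16384));
        Job job = new Job("simulation", 4, 4096);
        System.out.println("candidates: " + match(job, pool)); // exec-1 and exec-3
    }
}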

www.kellytechno.com
15
Case study - Condor
www.kellytechno.com
16
Case study - Condor
  • Role
  • Central Manager
  • The collector of information, and the negotiator
    between resources and resource requests
  • Execution machine
  • Responsible for executing Condor tasks
  • Submit machine
  • Responsible for submitting Condor tasks
  • Checkpoint servers
  • Responsible for storing all checkpoint files for
    the tasks

www.kellytechno.com
17
Case study - Condor
  • Robustness
  • One execution machine fails
  • We can execute the same task on other nodes.
  • Recovery
  • Only the daemon needs to be restarted when a
    failed node comes back online

www.kellytechno.com
18
Case study - Condor
  • Resource sharing
  • Each Condor user can share computing power with
    other Condor users.
  • Synchronization
  • Users need to take care of it themselves
  • Users can execute MPI jobs in a Condor pool, but
    must think about synchronization and deadlock
    issues.
  • Failure detection
  • The central manager knows when a node fails
  • Based on the update notifications sent by nodes
  • Scalability
  • Only a few commands are needed when new nodes
    come online.

www.kellytechno.com
19
Case study - Hadoop
  • HDFS
  • NameNode
  • manages the file system namespace and regulates
    access to files by clients.
  • determines the mapping of blocks to DataNodes.
  • DataNode
  • manages storage attached to the node that it
    runs on
  • saves CRC checksums
  • sends heartbeats to the NameNode.
  • Each file is split into blocks, and each block is
    stored on several DataNodes (see the client
    sketch after this list).
  • Secondary NameNode
  • responsible for merging the fsimage and edit log
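
A short sketch of an HDFS client read using Hadoop's FileSystem API (the style follows Hadoop: The Definitive Guide): the NameNode resolves the path to block locations and the data is then streamed from DataNodes. The URI below is a placeholder.

import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Read a file from HDFS: the NameNode resolves the path to block locations;
// the bytes themselves are streamed from the DataNodes holding the blocks.
public class HdfsCat {
    public static void main(String[] args) throws Exception {
        String uri = "hdfs://namenode-host:8020/user/demo/input.txt"; // placeholder
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}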

www.kellytechno.com
20
Case study - Hadoop
www.kellytechno.com
21
Case study - Hadoop
  • MapReduce framework
  • JobTracker
  • Responsible for dispatching jobs to the
    TaskTrackers
  • Handles job management such as scheduling and
    removal.
  • TaskTracker
  • Responsible for executing tasks. A TaskTracker
    usually launches a separate JVM to run each task
    (see the word-count sketch after this list).
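
To make the JobTracker/TaskTracker split concrete, here is the classic word-count job written against Hadoop's MapReduce API: the mapper and reducer are the task code that TaskTrackers run in child JVMs, and the Job set up in main is what gets submitted for scheduling. Input and output paths are placeholders.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
    // Map: emit (word, 1) for every word in the input split.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce: sum the counts emitted for each word.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/user/demo/input"));     // placeholder
        FileOutputFormat.setOutputPath(job, new Path("/user/demo/output"));  // placeholder
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}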

www.kellytechno.com
22
Case study - Hadoop
www.kellytechno.com
From Hadoop - The Definitive Guide
23
Case study - Hadoop
  • Data replication
  • Data are replicated to different nodes
  • Reduces the possibility of data loss
  • Data locality: tasks are sent to the nodes where
    the data reside (see the replication sketch after
    this list).
  • Robustness
  • A DataNode fails
  • We can get the data from other nodes.
  • A TaskTracker fails
  • We can start the same task on a different node
  • Recovery
  • Only the daemon needs to be restarted when a
    failed node comes back online
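
Replication is controlled by the dfs.replication setting (default 3) and can also be changed per file; a minimal sketch of doing both through Hadoop's client API, with a placeholder path.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Control how many copies of each block HDFS keeps.
public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("dfs.replication", "3");          // default replication for new files
        FileSystem fs = FileSystem.get(conf);

        // Replication can also be changed for an existing file.
        fs.setReplication(new Path("/user/demo/input.txt"), (short) 2); // placeholder path
    }
}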

www.kellytechno.com
24
Case study - Hadoop
  • Resource sharing
  • Each Hadoop user can share computing power and
    storage space with other Hadoop users.
  • Synchronization
  • No explicit synchronization is needed; the
    framework handles it.
  • Failure detection
  • The NameNode/JobTracker knows when a
    DataNode/TaskTracker fails
  • Based on heartbeats

www.kellytechno.com
25
Case study - Hadoop
  • Scalability
  • Only a few commands are needed when new nodes
    come online.
  • Optimization
  • A speculative task is launched only when a task
    takes too much time on one node.
  • The slower copy is killed as soon as the other
    one finishes (see the configuration sketch after
    this list).
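
A short sketch of toggling speculative execution when configuring a job. mapreduce.map.speculative and mapreduce.reduce.speculative are the property names used by Hadoop 2.x and later (both default to true); older releases used the mapred.*.tasks.speculative.execution names.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Speculative execution: a slow ("straggler") task is duplicated on another node,
// the copy that finishes first wins, and the slower one is killed.
public class SpeculativeConfigDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setBoolean("mapreduce.map.speculative", true);     // duplicate slow map tasks
        conf.setBoolean("mapreduce.reduce.speculative", false); // but not reduce tasks

        Job job = Job.getInstance(conf, "job with speculative maps");
        // ... set mapper/reducer/input/output as usual, then submit:
        // job.waitForCompletion(true);
    }
}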

www.kellytechno.com
26
Thank You
www.kellytechno.com