Title: Hadoop Course Online
Hadoop Course Online
The Contents
01. About Hadoop
02. Types of data that come under big data
03. Benefits of big data
04. Solution to process big data
05. Hadoop architecture
06. MapReduce
07. Hadoop Distributed File System
08. Working with Hadoop
09. Advantages of Hadoop
10. Hadoop Environment setup
11. Overview of HDFS
12. Features of HDFS
13. Architecture of HDFS
14. Operations of HDFS
15. Hadoop MapReduce
16. Big Data And Hadoop For Beginners
About Hadoop
Hadoop is an open-source software framework that allows the user to store and process large amounts of data. Hadoop runs on computer clusters built from commodity hardware. The Hadoop framework was designed by the Apache Software Foundation, and version 1.0 was released in December 2011. The storage part is referred to as the Hadoop Distributed File System (HDFS), and the processing part is performed using the MapReduce programming model.
Types of data that come under big data
Big data includes data generated by many different applications and devices. The types of data that come under the category of big data include black box data, social media data, power grid data, search engine data, transport data, and stock exchange data.
Benefits of big data
In hospitals, big data plays a vital role: patients' medical histories are stored using big data technologies, which helps doctors provide quicker service to patients. The main challenges of big data are capturing data, storage, curation, searching, transfer, sharing, analysis, and presentation.
Solution to process big data
If we have a small amount of data, we can store and process it using the traditional approach, in which the data is typically stored in an RDBMS such as MS SQL Server or Oracle Database. When dealing with huge amounts of data, it is not possible to store and process them with a traditional database server.
Hadoop architecture
The Hadoop framework has four modules: Hadoop Common, Hadoop YARN, Hadoop MapReduce, and the Hadoop Distributed File System (HDFS).
MapReduce
Hadoop MapReduce is a software framework used to write applications that process vast amounts of data across thousands of nodes.
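A minimal sketch of such an application is the classic word-count job, written here against the org.apache.hadoop.mapreduce API; the class names TokenizerMapper and IntSumReducer are illustrative, not part of the course material. The mapper emits a (word, 1) pair for every word it sees, and the reducer sums the counts for each word.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mapper: emits a (word, 1) pair for every word in an input line.
    public class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reducer: sums the counts emitted for each word.
    class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }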
Hadoop Distributed File System
HDFS is a distributed file system that runs on large clusters of small commodity machines in a fault-tolerant and reliable manner. This distributed file system is based on the Google File System (GFS).
Working with Hadoop
The application or user submits a job to the Hadoop client. The Hadoop client then sends the job and its configuration to the job tracker, which is responsible for splitting the work and distributing the configuration to the slaves. The job tracker also schedules the tasks and monitors them, and only then can it report the status back to the job client. After this, the task trackers on the different nodes execute the job using the MapReduce algorithm, and finally the output files are stored in the file system.
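From the client's side, submitting a job amounts to building a job configuration and handing it to the framework. A minimal driver sketch for the word-count example above; the class name and the path arguments are illustrative:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Driver: builds the job configuration and submits it to the cluster.
    public class WordCountDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(TokenizerMapper.class);   // mapper from the earlier sketch
            job.setReducerClass(IntSumReducer.class);    // reducer from the earlier sketch
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));    // input directory in HDFS
            FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory (must not already exist)
            // Submit the job and wait for it to finish, printing progress along the way.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }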
Advantages of Hadoop
Since Hadoop is an open-source framework, it is compatible with all platforms. Servers can be added to or removed from the cluster dynamically without interrupting Hadoop in any way. Users can write and test distributed systems quickly using the Hadoop framework.
Hadoop Environment setup
The Hadoop framework is supported on the Linux operating system, so Linux users can easily set up the Hadoop environment on their computers. Before installing Hadoop, users have to set up Linux with Secure Shell (SSH). Users running an OS other than Linux need to install software such as VirtualBox and run a Linux OS inside it.
Overview of HDFS
HDFS stores huge amounts of data and provides easy access to it. This distributed file system is fault tolerant, and it is designed to run on low-cost hardware.
Features of HDFS
HDFS provides file authentication and permissions. Interaction with HDFS is possible through the command interface that Hadoop provides. HDFS is well suited for distributed storage and processing.
Architecture of HDFS
The Hadoop Distributed File System follows a master-slave architecture with the following components: namenode, datanode, and block. The goals of HDFS are:
- Processing large datasets efficiently
- Fault detection and recovery
- Hardware at data (moving the computation close to where the data is stored)
Operations of HDFS
First, the users have to format the configured HDFS file system and start the distributed file system. Then they can list the files in HDFS, which retrieves the directory information from the server. After that, the users can insert data into the Hadoop Distributed File System and retrieve data from HDFS. Finally, they shut HDFS down.
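Formatting the namenode and starting or stopping HDFS are done with Hadoop's command-line tools; the data operations (listing, inserting, and retrieving files) can also be performed from Java through the FileSystem API. A minimal sketch, assuming an already configured cluster and using illustrative paths:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Listing, inserting, and retrieving data in HDFS through the Java FileSystem API.
    public class HdfsOperations {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);

            // List the files in an HDFS directory (the path /user/hadoop is illustrative).
            for (FileStatus status : fs.listStatus(new Path("/user/hadoop"))) {
                System.out.println(status.getPath());
            }

            // Insert data: copy a local file into HDFS.
            fs.copyFromLocalFile(new Path("/tmp/sample.txt"), new Path("/user/hadoop/sample.txt"));

            // Retrieve data: copy a file from HDFS back to the local file system.
            fs.copyToLocalFile(new Path("/user/hadoop/sample.txt"), new Path("/tmp/sample_copy.txt"));

            fs.close();
        }
    }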
Hadoop MapReduce
MapReduce is a processing technique used to handle enormous amounts of data. The algorithm performs two tasks to process the data completely: map and reduce. The map task converts a set of data into another set of data, in which individual elements are broken down into tuples. The output of the map is then taken as input by the reduce task, which combines the data tuples into a smaller set of tuples. A MapReduce program executes in three stages: the map stage, the shuffle stage, and the reduce stage.
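A small trace of the word-count job from the earlier sketch makes the three stages concrete (the input lines are made up for the illustration):

    map     input:  "deer bear river" and "deer car bear"
    map     output: (deer,1) (bear,1) (river,1) (deer,1) (car,1) (bear,1)
    shuffle output: (bear,[1,1]) (car,[1]) (deer,[1,1]) (river,[1])
    reduce  output: (bear,2) (car,1) (deer,2) (river,1)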
Big Data And Hadoop For Beginners
Beginners, students, managers, and developers can take this course if they are interested in learning Big Data. This 3-hour Hadoop course online has six sections with one article and six supplemental resources. The prime goal of this course is to help you understand the Hadoop components and their architecture.
Hadoop Course Online 2017
Related courses:
- Hadoop Tutorial: Learn Big Data
- Learn Hadoop And MapReduce For Big Data Problems
- Big Data And Hadoop For Beginners
THANKS FOR YOUR TIME!