An Introduction to Apache Hadoop MapReduce

An introduction to Apache Hadoop MapReduce: what is it and how does it work? What is the MapReduce cycle and how are jobs managed? Why should it be used, and who are the big users and providers?

Slides: 10
Provided by: semtechs

Transcript and Presenter's Notes

1
Apache Hadoop MapReduce
  • What is it?
  • Why use it?
  • How does it work?
  • Some examples
  • Big users

2
MapReduce - What is it?
  • Processing engine of Hadoop
  • Developers write the Map and Reduce functions for each job
  • Used for big data batch processing
  • Parallel processing of huge data volumes
  • Fault tolerant
  • Scalable

3
MapReduce - Why use it?
  • Your data is in the terabyte / petabyte range
  • You have huge I/O
  • The Hadoop framework takes care of
      • Job and task management
      • Failures
      • Storage
      • Replication
  • You just write the Map and Reduce functions
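
To give a feel for how little code the developer supplies, here is a minimal sketch of a map step and a reduce step. It assumes a text-based style similar to Hadoop Streaming, where the steps exchange tab-separated "key<TAB>value" lines, and it assumes word counting as the task. The function names mapper and reducer are illustrative and not part of any Hadoop API.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' line for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reducer(sorted_lines):
    """Sum the counts of consecutive lines that share the same key.

    Assumes the input is already sorted by key, which is the job of
    the shuffle/sort step that runs between map and reduce.
    """
    pairs = (line.rstrip("\n").split("\t") for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"
```

Locally, the framework's part can be imitated with sorted(), for example list(reducer(sorted(mapper(["Bear Car", "Bear"])))). On a cluster, the splitting, sorting, scheduling and recovery from failures would be handled by Hadoop rather than by this code.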

4
MapReduce - How does it work?
  • Take word counting as an example, something that
    Google does all of the time.

5
MapReduce - How does it work?
  • Input data is split into shards
  • Split data is mapped to key,value pairs, e.g.
    Bear,1
  • Mapped data is shuffled/sorted by key, e.g. Bear
  • Sorted data is reduced, e.g. Bear,2
  • Final data is stored on HDFS
  • There might be an extra map layer before the shuffle
  • The JobTracker controls all tasks in a job
  • TaskTrackers control the map and reduce tasks
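
The cycle above can be sketched in a few lines of plain Python. This is a single-process illustration only, not Hadoop code: the function names, the round-robin sharding and the three input lines are all assumptions made for the example. A real job would run the map and reduce tasks in parallel across a cluster and write the result to HDFS.

```python
from collections import defaultdict


def split_input(lines, num_shards):
    """Split: deal the input lines round-robin into num_shards shards."""
    shards = [[] for _ in range(num_shards)]
    for i, line in enumerate(lines):
        shards[i % num_shards].append(line)
    return shards


def map_shard(shard):
    """Map: turn every word in a shard into a (word, 1) pair."""
    return [(word, 1) for line in shard for word in line.split()]


def shuffle(mapped_shards):
    """Shuffle/sort: gather all values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for pairs in mapped_shards:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_counts(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, num_shards=3):
    """Run the whole cycle: split, map, shuffle/sort, reduce."""
    shards = split_input(lines, num_shards)
    mapped = [map_shard(shard) for shard in shards]
    return reduce_counts(shuffle(mapped))


# Made-up input, for illustration only.
lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
result = word_count(lines)
print(result)  # {'Bear': 2, 'Car': 3, 'Deer': 2, 'River': 2}
```

Each shard is mapped independently of the others, which is what lets the map step be spread over many machines. Only the shuffle needs to see the output of every mapper.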

6
MapReduce - Some examples
  • A visual example with colours to show you the
    cycle
  • Split -> Map -> Shuffle -> Reduce

7
MapReduce - Some examples
  • A visual example of MapReduce with job and task
    trackers added to individual map and reduce jobs.

8
Hadoop MapReduce - Big users
  • Users
      • Facebook
      • Yahoo
      • Amazon
      • eBay
  • Providers
      • Amazon
      • Cloudera
      • Hortonworks
      • MapR

9
Contact Us
  • Feel free to contact us at
      • www.semtech-solutions.co.nz
      • info_at_semtech-solutions.co.nz
  • We offer IT project consultancy
  • We are happy to hear about your problems
  • You can just pay for those hours that you need
    to solve your problems