1
Hadoop Jobs and Tasks
  • ReddyRaja

2
Brief Overview
  (Flow of a job run across the client node, the JobTracker node,
   the shared file system (HDFS), and a TaskTracker node)
  • 1 Run job (MapReduce program, client JVM, client node)
  • 2 Get new job ID (JobClient asks the JobTracker)
  • 3 Copy job resources (JobClient to the shared file system, HDFS)
  • 4 Submit job (JobClient to the JobTracker)
  • 5 Initialize job (JobTracker, JobTracker node)
  • 6 Retrieve input splits (JobTracker from HDFS)
  • 7 Heartbeat returns a task (TaskTracker to JobTracker)
  • 8 Retrieve job resources (TaskTracker from HDFS)
  • 9 Launch child JVM (TaskTracker)
  • 10 Run map task or reduce task (child, TaskTracker node)
3
Submit Job
  • Asks the JobTracker for a new job ID
  • Checks the output spec of the job: if the output
    directory already exists, an error is thrown and
    the job is not submitted
  • Computes the input splits for the job: if the
    splits cannot be computed (e.g. the input does
    not exist), an error is thrown and the job is
    not submitted
  • Copies the resources needed to run the job to
    the JobTracker's file system, in a directory
    named after the job ID:
  • The job JAR file, copied with a high replication
    factor (10 by default; can be set by the
    mapred.submit.replication property)
  • The configuration file
  • The computed input splits
  • Tells the JobTracker that the job is ready for
    execution (see the sketch below)
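A minimal driver sketch of this submission path, using the classic
org.apache.hadoop.mapred API. The class name, job name, and argument
handling are illustrative assumptions, not from the slides.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SubmitExample {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SubmitExample.class);
        conf.setJobName("submit-example");

        FileInputFormat.addInputPath(conf, new Path(args[0]));
        // The output directory must not already exist, or submission fails.
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // Replication factor for the job JAR copied to the JobTracker's filesystem.
        conf.setInt("mapred.submit.replication", 10);

        // Checks the output spec, computes splits, copies resources,
        // submits the job, and polls progress until completion.
        JobClient.runJob(conf);
      }
    }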

4
Job Initialization
  • Puts the job in an internal queue
  • The job scheduler picks it up and initializes it
  • Creates a job object to represent the job being
    run
  • Encapsulates its tasks
  • Keeps bookkeeping information to track task
    status and progress
  • Creates the list of tasks to run
  • Retrieves the input splits computed by the
    JobClient from the shared filesystem
  • Creates one map task per split
  • The scheduler creates the reduce tasks
  • The number of reduce tasks is determined by the
    mapred.reduce.tasks property (see the sketch
    below)
  • Task IDs are assigned to each task
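A minimal sketch of how a job fixes its reduce task count; the value 4 is
an illustrative assumption.

    import org.apache.hadoop.mapred.JobConf;

    public class ReduceTaskCount {
      // The number of map tasks follows from the input splits; the number of
      // reduce tasks is whatever the job configures (mapred.reduce.tasks).
      static void configure(JobConf conf) {
        conf.setNumReduceTasks(4);   // same as conf.setInt("mapred.reduce.tasks", 4)
      }
    }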

5
Task Assignment
  • TaskTrackers send heartbeats to the JobTracker
  • A TaskTracker indicates its readiness for a new
    task
  • The JobTracker allocates a task
  • The JobTracker communicates the task in the
    response to the heartbeat
  • Choosing a task for a TaskTracker
  • The JobTracker must choose a task for the
    TaskTracker
  • It uses a scheduler to choose a task
  • Job scheduling algorithms: the default one
    assigns tasks based on job priority

6
Task Assignment
  • TaskTrackers have a fixed number of slots for
    map tasks and for reduce tasks
  • A TaskTracker may, for example, run 2 map tasks
    and 2 reduce tasks simultaneously (the slot
    count is fixed by configuration rather than
    determined dynamically from the number of cores
    and the amount of memory on the TaskTracker)
  • The scheduler fills the map task slots before
    filling the reduce task slots
  • For a map task, the JobTracker takes the
    TaskTracker's network location into account and
    picks a task whose split is as close as possible
    to the TaskTracker
  • The ideal case is a TaskTracker node on which
    the split resides, called data-local
  • Rack-local: on the same rack, but not on the
    same node
  • Some tasks are neither data-local nor rack-local
    and retrieve their data from a different rack
  • Counters track how many tasks are data-local,
    rack-local, or non-local
  • For reduce tasks, the JobTracker simply picks
    the next in its list of yet-to-be-run reduce
    tasks, since there are no data locality
    considerations

7
Task Execution
  • The TaskTracker has been assigned a task
  • The next step is to run the task
  • It localizes the job by copying the JAR file and
    any other required files from the shared file
    system
  • Creates a local working directory for the task
    and un-jars the contents of the JAR into it
  • Creates an instance of TaskRunner to run the
    task
  • The TaskRunner launches a new JVM to run each
    task
  • This prevents bugs in user MapReduce code from
    crashing the TaskTracker
  • Only the child JVM exits in case of a problem

8
Task Execution ..continued
9
Progress and Status Updates
  • MapReduce jobs are long-running jobs
  • Users need feedback from time to time on the
    progress of the job
  • Jobs and tasks each have a status, comprising
  • Running, successfully completed, or failed
  • The progress of maps and reduces
  • The values of job counters
  • Status messages and descriptions
  • Progress is estimated differently depending on
    the phase the task is running

10
Progress Reporting
  • Not 100% accurate
  • Nevertheless important for telling whether a job
    is making progress or not
  • The following operations constitute progress
    (illustrated in the sketch below):
  • Reading an input record
  • Writing an output record
  • Setting the status description on a Reporter
  • Incrementing a counter
  • Calling the Reporter's progress() method
  • Tasks can also set counters
  • Framework built-in ones
  • User-defined ones
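A minimal sketch of a map function performing several of these
progress-counting operations; the class ProgressMapper, the counter group
"records", and the status text are illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class ProgressMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, LongWritable> output, Reporter reporter)
          throws IOException {
        output.collect(value, new LongWritable(1));          // writing an output record
        reporter.setStatus("processed offset " + key.get()); // setting the status description
        reporter.incrCounter("records", "processed", 1);     // incrementing a counter
        reporter.progress();                                 // explicit progress call
      }
    }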

11
Progress Reporting .. continued
  • Framework support
  • If the progress flag is set, it indicates that
    status is to be sent to the TaskTracker
  • The flag is checked in a separate thread every 3
    seconds, and the TaskTracker is notified of the
    status
  • The TaskTracker sends the status via heartbeats
    to the JobTracker every 5 seconds
  • The status of all tasks run by the TaskTracker
    is sent
  • Counter values are sent less frequently to avoid
    congestion
  • The JobTracker combines these status reports
  • This gives a global view of all jobs, their
    constituent tasks, and their statuses
  • The JobClient receives the status by polling the
    JobTracker every second
  • Clients can also call getJobStatus to get the
    status information

12
Progress Reporting .. continued
13
Job Completion
  • The JobTracker receives notification that the
    last task of the job is complete
  • It changes the job status to successful
  • When the JobClient polls for the status, it
  • Prints a message to the user and
  • Returns from the runJob method
  • The JobTracker can also send an HTTP job
    notification
  • This can be configured by clients wishing to be
    notified via callbacks
  • Clients set the job.end.notification.url
    property (see the sketch below)
  • The JobTracker cleans up its working state for
    the job
  • It also instructs the TaskTrackers to do the
    same
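A minimal sketch of registering an HTTP job notification; the URL is a
hypothetical placeholder ($jobId and $jobStatus are substituted by the
framework).

    import org.apache.hadoop.mapred.JobConf;

    public class JobEndNotification {
      // Register an HTTP callback invoked by the JobTracker when the job finishes.
      static void configure(JobConf conf) {
        conf.set("job.end.notification.url",
                 "http://example.com/jobdone?jobid=$jobId&status=$jobStatus");
      }
    }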

14
Task Failure
  • Causes
  • User code is buggy
  • Processes crash
  • Machines fail
  • Hadoop handles these failures quite smoothly

15
Task Failure .. continued
  • The child JVM reports the error back to the
    TaskTracker before exiting
  • The error is logged into the user logs
  • The TaskTracker marks the task as failed
  • It frees up the slot for another task
  • Hanging tasks
  • The TaskTracker notices that it has not received
    any progress update
  • It proceeds to mark the status as failed
  • The child JVM process is killed after the
    timeout period, which is normally 10 minutes
  • The timeout can be configured on a per-job basis
    (see the sketch below)
  • Setting a timeout of zero means a hanging slot
    is never freed, so avoid this
  • At the very least, send progress updates by
    setting the progress flag
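A minimal sketch of the per-job timeout mentioned above, assuming the classic
mapred.task.timeout property; the explicit 10-minute value shown is an
illustrative choice.

    import org.apache.hadoop.mapred.JobConf;

    public class TaskTimeout {
      // Timeout (in milliseconds) after which a task reporting no progress is
      // considered hung and its child JVM is killed; 0 disables the timeout.
      static void configure(JobConf conf) {
        conf.setLong("mapred.task.timeout", 10 * 60 * 1000L);
      }
    }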

16
Task Failure .. continued
  • When a task fails
  • The JobTracker is notified
  • The JobTracker reschedules the execution of the
    task
  • It avoids scheduling it on a TaskTracker where
    it has failed earlier
  • It will try 4 times before giving up
  • mapred.map.max.attempts for map tasks
  • mapred.reduce.max.attempts for reduce tasks
  • If any task fails more than the allowed number
    of attempts, the whole job is marked as failed
  • The proportion of task failures a job tolerates
    can be changed by setting (see the sketch below)
  • mapred.max.map.failures.percent
  • mapred.max.reduce.failures.percent
  • A task can also be killed, for example if it is
    a speculative duplicate
  • Killed tasks do not count toward the number of
    failed attempts
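A minimal sketch of tuning these limits on a JobConf; the numeric values are
illustrative assumptions.

    import org.apache.hadoop.mapred.JobConf;

    public class FailureTuning {
      // Retry attempts per task, plus the percentage of failed tasks a job may
      // tolerate before the whole job is marked as failed.
      static void configure(JobConf conf) {
        conf.setInt("mapred.map.max.attempts", 4);
        conf.setInt("mapred.reduce.max.attempts", 4);
        conf.setInt("mapred.max.map.failures.percent", 5);
        conf.setInt("mapred.max.reduce.failures.percent", 5);
      }
    }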

17
TaskTracker Failure
  • Symptoms
  • Fails to send heartbeats
  • It might have crashed, or
  • It might be running very slowly
  • The JobTracker marks it as failed and removes it
    from the pool of TaskTrackers to schedule tasks
    on
  • This happens when heartbeats have been missed
    for 10 minutes, configurable via
    mapred.task.tracker.expiry.interval
  • The JobTracker arranges for map tasks that
    completed on the failed TaskTracker to be rerun
    on a different TaskTracker if their job is still
    incomplete, since their intermediate output is
    no longer accessible
  • Any tasks in progress are also rescheduled
  • The JobTracker can also blacklist a TaskTracker
  • This happens if its number of task failures is
    significantly higher than the average failure
    rate on the cluster
  • A blacklisted TaskTracker can be removed from
    the JobTracker's blacklist by restarting it

18
Job Scheduling
  • Simple approach
  • Jobs are run in the order of submission, using
    the FIFO scheduler
  • Fair Scheduler
  • Capacity Scheduler

19
Shuffle and Sort
  • The MapReduce framework guarantees that the
    input to every reducer is sorted by key
  • The process by which the system performs this
    sort is the sort phase
  • The transfer of map outputs to the reducers as
    their inputs is the shuffle phase
  • The shuffle code base keeps changing, and
    continuous improvements are made
  • The shuffle is the heart of MapReduce

20
Shuffle and Sort ..continued
21
Shuffle and Sort ..continued
  • Map side
  • Map output is written to a circular memory
    buffer
  • The map blocks writing if the buffer fills up
  • A background thread starts spilling to disk once
    the buffer reaches a threshold (80% by default)
  • Map outputs continue to be written to the buffer
    while the spill takes place
  • Before writing to disk, the thread partitions
    the data according to the reducer it has to go
    to
  • Within each partition, an in-memory sort by key
    is performed and
  • A combiner function, if defined, is run on the
    output of the sort
  • Several spill files are created
  • The spills are merged into a single partitioned
    and sorted output file
  • The combiner may be run again before the output
    file is written
  • Data written to disk can be compressed

22
Shuffle and Sort ..continued
  • Reduce side
  • Needs the map output from several mappers
  • The copy phase copies the map outputs over to
    the reduce task
  • A small number of copier threads fetch the
    outputs in parallel

23
Shuffle and Sort ..continued
  • How do the reducers know where to get the map
    outputs from?
  • A map task notifies its TaskTracker when it has
    completed
  • The TaskTracker sends the update to the
    JobTracker
  • The JobTracker therefore knows, for a given job,
    which map outputs exist and which TaskTrackers
    they are available on
  • Reducers ask the JobTracker for this information
    periodically until they have retrieved all the
    map outputs
  • TaskTrackers do not delete map outputs from disk
    until the job is complete
  • The reduce task may fail and need them again
  • They wait until told to delete them by the
    JobTracker

24
Task Execution
  • Speculative Execution
  • Task JVM Reuse
  • Skipping Bad records
  • Task Execution environment
  • Counters
  • Sorting
  • Secondary Sort
  • Joins
  • Side data distribution

25
Speculative Execution
  • Tasks are run in parallel
  • A single slow task can make the whole job take
    significantly longer
  • Out of a few thousand tasks, a few could be
    straggling
  • Hadoop tries to detect slow-running tasks
  • Hadoop launches a backup task when a task is
    found to be running slower than expected
  • After one copy of the task completes
    successfully, any duplicate copies are killed
  • It is an optimization technique; if a task is
    inherently slow, it does not help
  • It can be turned on or off per job (see the
    sketch below)
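A minimal sketch of switching speculative execution on or off per job; the
particular true/false choices are illustrative.

    import org.apache.hadoop.mapred.JobConf;

    public class SpeculationSettings {
      // Speculative execution is controlled separately for map and reduce tasks
      // (mapred.map.tasks.speculative.execution and
      // mapred.reduce.tasks.speculative.execution).
      static void configure(JobConf conf) {
        conf.setMapSpeculativeExecution(true);
        conf.setReduceSpeculativeExecution(false);
      }
    }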

26
Task JVM Reuse
  • By default, Hadoop runs each task in its own JVM
  • When JVM reuse is enabled,
  • Tasks share a child JVM and run in it
    sequentially
  • The TaskTracker can still run tasks in parallel,
    in separate JVMs
  • Tasks from different jobs are always run in
    different child JVMs
  • mapred.job.reuse.jvm.num.tasks controls the
    maximum number of tasks per JVM (see the sketch
    below)
  • -1 indicates no limit
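A minimal sketch of enabling unlimited JVM reuse for a job; the setter wraps
the mapred.job.reuse.jvm.num.tasks property.

    import org.apache.hadoop.mapred.JobConf;

    public class JvmReuse {
      // How many tasks of the same job a child JVM may run in sequence;
      // -1 means no limit, 1 (the default) means a fresh JVM per task.
      static void configure(JobConf conf) {
        conf.setNumTasksToExecutePerJvm(-1);
      }
    }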

27
Skipping Bad Records
  • Large data sets can contain corrupt records
  • They often have missing fields
  • In practice, the code should tolerate such
    records
  • Bad records have to be handled in the Mapper or
    Reducer
  • Typically by ignoring the records
  • TextInputFormat has a feature to cap the maximum
    record length (see the sketch below)
  • Corrupted records are often abnormally long
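A minimal sketch of capping the line length read by TextInputFormat, assuming
the classic mapred.linerecordreader.maxlength property; the 1 MB limit is an
illustrative value.

    import org.apache.hadoop.mapred.JobConf;

    public class MaxLineLength {
      // Lines longer than this many bytes are skipped rather than read fully
      // into memory, which guards against corrupt, abnormally long records.
      static void configure(JobConf conf) {
        conf.setInt("mapred.linerecordreader.maxlength", 1024 * 1024);
      }
    }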

28
Task Execution Environment
  • Hadoop provides the task with information about
    its environment
  • Several properties can be accessed from the job
    configuration
  • Task output files
  • Multiple instances of the same task
  • Should not write into the same file
  • If a task fails and is retried, its old partial
    output file would still be present
  • With speculative execution, two instances of the
    same task could write into the same file
  • Solution
  • Hadoop writes the files into a temporary
    directory specific to the task attempt (see the
    sketch below)
  • mapred.output.dir/_temporary/mapred.task.id/
  • On successful completion of the attempt, the
    files are promoted to mapred.output.dir
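A minimal sketch of writing a task side file into the attempt-specific working
directory; the file name side-data.txt is a hypothetical example.

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobConf;

    public class SideFileWriter {
      // Files placed under the attempt's working directory are promoted to
      // mapred.output.dir only if the attempt succeeds, so retried or
      // speculative attempts cannot clash.
      static Path sideFile(JobConf conf) throws IOException {
        Path workDir = FileOutputFormat.getWorkOutputPath(conf);
        return new Path(workDir, "side-data.txt");
      }
    }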

29
Counters
  • Counters are used to gather statistics about the
    job
  • Quality control (good vs. bad records)
  • Application-level statistics
  • Problem diagnosis
  • Counters are easier to retrieve than log outputs
  • Built-in counters
  • Input records, input bytes
  • Output records, output bytes, etc.

30
User Defined Counters
31
User Defined Counters
  • Counters are grouped by enum name (see the
    sketch below)
  • The enum fields are the counter names
  • Dynamic counters can also be created by name at
    runtime
  • Readable counter names
  • Can be provided via a resource bundle
  • e.g. Air Temperature Records instead of
    Temperature.MISSING
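A minimal sketch of an enum-based counter group alongside a dynamic counter;
the mapper class, the Temperature enum values, and the group name
"TemperatureQuality" are illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reporter;

    public class TemperatureMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

      enum Temperature { MISSING, MALFORMED }   // enum name = group, fields = counters

      public void map(LongWritable key, Text value,
                      OutputCollector<Text, LongWritable> output, Reporter reporter)
          throws IOException {
        String line = value.toString();
        if (line.isEmpty()) {
          reporter.incrCounter(Temperature.MISSING, 1);           // enum counter
        } else {
          reporter.incrCounter("TemperatureQuality", "valid", 1); // dynamic counter
          output.collect(new Text(line), new LongWritable(1));
        }
      }
    }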

32
Retrieving Counters
  • Counters can be retrieved from a completed job,
    for example as in the sketch below
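A minimal sketch of retrieving counters from the JobTracker once the job has
finished; the job ID argument and the counter group and name are illustrative
assumptions.

    import org.apache.hadoop.mapred.Counters;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.JobID;
    import org.apache.hadoop.mapred.RunningJob;

    public class RetrieveCounters {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        JobClient client = new JobClient(conf);
        // Look up the job by its ID string, e.g. "job_200904110811_0002".
        RunningJob job = client.getJob(JobID.forName(args[0]));
        if (job != null) {
          Counters counters = job.getCounters();
          long valid = counters.findCounter("TemperatureQuality", "valid").getCounter();
          System.out.println("Valid records: " + valid);
        }
      }
    }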

33
Sorting
  • By default, keys are sorted before being sent to
    the reduce task
  • The sort order for keys is controlled by
  • The mapred.output.key.comparator.class property
    (see the sketch below)
  • Keys must be a subclass of WritableComparable
  • Partitioned MapFile lookup
  • If MapFileOutputFormat is used, lookups by key
    can be performed against the output
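A minimal sketch of overriding the key sort order; using
LongWritable.DecreasingComparator here assumes the map output keys are
LongWritable and is purely illustrative.

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.mapred.JobConf;

    public class SortOrder {
      // Sets mapred.output.key.comparator.class; this comparator reverses the
      // natural ordering of LongWritable keys.
      static void configure(JobConf conf) {
        conf.setOutputKeyComparatorClass(LongWritable.DecreasingComparator.class);
      }
    }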

34
Secondary Sort
  • MapReduce sorts records by key
  • Values are not sorted
  • Use the following strategy to get the values
    sorted
  • Use a composite key that includes the value
    portion
  • A key comparator orders by the composite key
  • A partitioner and grouping comparator that look
    only at the natural key keep all records for the
    same natural key together at one reducer

35
Joins
  • MapReduce can perform joins of large data sets
  • Frameworks such as Pig, Hive, or Cascading can
    also be used to achieve a join
  • Map-side joins
  • Use CompositeInputFormat (see the sketch below)
  • The join is performed before the data reaches
    the map function
  • Reduce-side joins
  • The join key is used as the map output key
  • Multiple inputs
  • Use a different mapper per source; their map
    output types must be the same
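A minimal sketch of configuring a map-side join with CompositeInputFormat;
the "inner" join type, the input format, and the two paths are illustrative
assumptions (the inputs must already be sorted and identically partitioned).

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.KeyValueTextInputFormat;
    import org.apache.hadoop.mapred.join.CompositeInputFormat;

    public class MapSideJoin {
      // The join expression (mapred.join.expr) describes how the sources combine.
      static void configure(JobConf conf) {
        conf.setInputFormat(CompositeInputFormat.class);
        conf.set("mapred.join.expr", CompositeInputFormat.compose(
            "inner", KeyValueTextInputFormat.class,
            new Path("/data/left"), new Path("/data/right")));
      }
    }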

36
Side data distribution
  • Extra read-only data needed by MapReduce jobs
  • The challenge is to make this data available to
    the map and reduce tasks
  • Cache the side data in a static field
  • Use the job configuration
  • Override the configure() method to read it back
    in the task
  • To pass objects, use DefaultStringifier, which
    uses Hadoop serialization (see the sketch below)
  • Do not use this mechanism to transfer more than
    about a kilobyte of data
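A minimal sketch of passing a small object through the job configuration with
DefaultStringifier; the key name "metadata" and the Text payload are
illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.DefaultStringifier;
    import org.apache.hadoop.io.Text;

    public class SideDataInConf {
      // Serialize the object into the configuration in the driver ...
      static void store(Configuration conf) throws IOException {
        DefaultStringifier.store(conf, new Text("station-metadata-v1"), "metadata");
      }

      // ... and read it back in the task, e.g. from configure(JobConf).
      static Text load(Configuration conf) throws IOException {
        return DefaultStringifier.load(conf, "metadata", Text.class);
      }
    }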

37
Side Data Distribution - continued
  • Distributed Cache
  • Copies files and archives once per job to the
    task nodes
  • Makes them available to the MapReduce functions
  • -files and -archives options
  • Files can be local or in HDFS
  • Example: hadoop <other args> -files
    input/ncdc/metadata/stations-fixed-width.txt

38
Side Data: How it Works
  • When the job is launched, Hadoop copies the
    files specified by the -files option to the
    JobTracker's file system; before a task runs,
    the TaskTracker copies them to a local disk,
    the cache
  • From the task's point of view, the files are
    simply there
  • A reference count of the number of tasks using
    each file is maintained; when it drops to zero,
    the file becomes eligible for deletion
  • Files are deleted when the cache size exceeds 10
    GB, making way for other jobs
  • Files are localized under the
    (mapred.local.dir)/taskTracker/archive directory
    on the TaskTrackers
  • Applications can use the files as if they were
    local; the files are symbolically linked into
    the task's working directory