Hadoop Map/Reduce

    Owen O’Malley
      July 2006
Map/Reduce Goals
– Distribution
   • The data is available where needed.
   • Application does not care how many computers
     are being used.
– Reliability
   • Application does not care that computers or
     networks may have temporary or permanent
     failures.


Application Perspective
• Define Mapper and Reducer classes and a
  “launching” program.
• Mapper
  – Is given a stream of key1, value1 pairs
  – Generates a stream of key2, value2 pairs
• Reducer
  – Is given a key2 and a stream of value2’s
  – Generates a stream of key3, value3 pairs
• Launching Program
  – Creates a JobConf to define a job.
  – Submits JobConf to JobTracker and waits for
    completion.
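
To make the pattern concrete, here is a sketch of the canonical
WordCount application against the old org.apache.hadoop.mapred API.
Some details (e.g. FileInputFormat.setInputPaths) come from a slightly
later Hadoop release than these 2006 slides, so treat this as
illustrative rather than period-exact:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class WordCount {

  // Mapper: (byte offset, line) -> (word, 1)
  public static class Map extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        output.collect(word, one);
      }
    }
  }

  // Reducer: (word, [1, 1, ...]) -> (word, sum)
  public static class Reduce extends MapReduceBase
      implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter reporter) throws IOException {
      int sum = 0;
      while (values.hasNext()) {
        sum += values.next().get();
      }
      output.collect(key, new IntWritable(sum));
    }
  }

  // Launching program: build the JobConf, submit, and wait.
  public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName("wordcount");
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);
    conf.setMapperClass(Map.class);
    conf.setReducerClass(Reduce.class);
    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));
    JobClient.runJob(conf);   // blocks until the job completes
  }
}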
Application Dataflow

(diagram-only slide; image not included in this transcript)
Input & Output Formats
• The application also chooses input and output
  formats, which define how the persistent data
  is read and written. These are interfaces and
  can be defined by the application.
• InputFormat
  – Splits the input to determine the input to each map
    task.
  – Defines a RecordReader that reads key, value
    pairs that are passed to the map task.
• OutputFormat
  – Given the key, value pairs and a filename, writes
    the reduce task output to persistent store.
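
For reference, the two interfaces have roughly this shape in the old
org.apache.hadoop.mapred API (exact signatures varied across early
releases, so this is a sketch rather than a definitive definition):

import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.RecordWriter;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.util.Progressable;

interface InputFormat<K, V> {
  // Carve the input into one InputSplit per map task; numSplits is a hint.
  InputSplit[] getSplits(JobConf job, int numSplits) throws IOException;

  // The RecordReader turns a split into a stream of (key, value) pairs.
  RecordReader<K, V> getRecordReader(InputSplit split, JobConf job,
                                     Reporter reporter) throws IOException;
}

interface OutputFormat<K, V> {
  // The RecordWriter persists a task's (key, value) output under the
  // given file name.
  RecordWriter<K, V> getRecordWriter(FileSystem fs, JobConf job,
                                     String name, Progressable progress)
      throws IOException;

  // Fail fast on an invalid output spec (e.g. the directory exists).
  void checkOutputSpecs(FileSystem fs, JobConf job) throws IOException;
}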
Output Ordering
• The application can control the sort order and
  partitions of the output via
  OutputKeyComparator and Partitioner.
• OutputKeyComparator
   – Defines how to compare serialized keys.
   – Defaults to the key’s WritableComparable ordering,
     but should be defined for any application-defined
     key types.
      • key1.compareTo(key2)
• Partitioner
   – Given a map output key and the number of
     reduces, chooses a reduce.
   – Defaults to HashPartitioner
      • key.hashCode % numReduces
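
A sketch of a custom Partitioner against the old mapred API; the class
below is a hypothetical example, and the generic interface shape is
from a slightly later release than these slides:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Partitioner;

// FirstCharPartitioner is a hypothetical example: it routes Text keys
// by their first character so each reduce sees a contiguous range.
// The default HashPartitioner is essentially
// (key.hashCode() & Integer.MAX_VALUE) % numPartitions.
public class FirstCharPartitioner implements Partitioner<Text, Writable> {
  public void configure(JobConf job) { }   // no per-job setup needed

  public int getPartition(Text key, Writable value, int numPartitions) {
    if (key.getLength() == 0) {
      return 0;
    }
    return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
  }
}

It is installed with conf.setPartitionerClass(FirstCharPartitioner.class);
the sort order is set analogously via conf.setOutputKeyComparatorClass.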
Combiners
• Combiners are an optimization for jobs with
  reducers that can merge multiple values into
  a single value.
• Typically, the combiner is the same as the
  reducer and runs on the map outputs before they
  are transferred to the reducer’s machine.
• For example, WordCount’s mapper generates
  (word, count) and the combiner and reducer
  generate the sum for each word.
  – Input: “hi Owen bye Owen”
  – Map output: (“hi”, 1), (“Owen”, 1), (“bye”,1), (“Owen”,1)
  – Combiner output: (“Owen”, 2), (“bye”, 1), (“hi”, 1)
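
In a real job the combiner is installed with
conf.setCombinerClass(Reduce.class); the effect on the slide’s data can
be simulated in a few lines of plain Java (an illustration, not Hadoop
code):

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Stand-alone simulation of the slide's WordCount example: the combiner
// pre-sums counts on the map side, which is exactly what the reducer
// does with the final groups.
public class CombinerDemo {
  public static void main(String[] args) {
    // Map output for the input "hi Owen bye Owen".
    List<Map.Entry<String, Integer>> mapOutput = Arrays.asList(
        Map.entry("hi", 1), Map.entry("Owen", 1),
        Map.entry("bye", 1), Map.entry("Owen", 1));
    // Combine: group by key and sum the counts.
    SortedMap<String, Integer> combined = new TreeMap<>();
    for (Map.Entry<String, Integer> e : mapOutput) {
      combined.merge(e.getKey(), e.getValue(), Integer::sum);
    }
    System.out.println(combined);  // {Owen=2, bye=1, hi=1}
  }
}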
Process Communication
• Use a custom RPC implementation
  –   Easy to change/extend
  –   Defined as Java interfaces
  –   Server objects implement the interface
  –   Client proxy objects automatically created
• All messages originate at the client
  – Prevents cycles and therefore deadlocks
• Errors
  – Include timeouts and communication problems.
  – Are signaled to client via IOException.
  – Are NEVER signaled to the server.
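
A minimal sketch of that pattern using a standard java.lang.reflect.Proxy
in place of Hadoop’s actual ipc package; PingProtocol and the in-process
handler are hypothetical stand-ins:

import java.io.IOException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// The protocol is a plain Java interface; the server object would
// implement it, and the client talks to an automatically created proxy.
interface PingProtocol {
  String ping(String message) throws IOException;
}

public class RpcSketch {
  static PingProtocol getProxy() {
    // A real handler would serialize the method name and arguments,
    // send them to the server, and deserialize the reply (raising
    // IOException on timeouts or transport failures). Here we fake it.
    InvocationHandler handler = (proxy, method, args) -> "pong: " + args[0];
    return (PingProtocol) Proxy.newProxyInstance(
        PingProtocol.class.getClassLoader(),
        new Class<?>[] { PingProtocol.class }, handler);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(getProxy().ping("hello"));  // prints "pong: hello"
  }
}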
Map/Reduce Processes
• Launching Application
  – User application code
  – Submits a specific kind of Map/Reduce job
• JobTracker
  – Handles all jobs
  – Makes all scheduling decisions
• TaskTracker
  – Manager for all tasks on a given node
• Task
  – Runs an individual map or reduce fragment for a
    given job
  – Forks from the TaskTracker
Process Diagram

(diagram-only slide; image not included in this transcript)
Job Control Flow
• Application launcher creates and submits job.
• JobTracker initializes job, creates FileSplits,
  and adds tasks to queue.
• TaskTrackers ask for a new map or reduce
  task every 10 seconds or when the previous
  task finishes.
• As tasks run, the TaskTracker reports status
  to the JobTracker every 10 seconds.
• When job completes, the JobTracker tells the
  TaskTrackers to delete temporary files.
• Application launcher notices job completion
  and stops waiting.
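
A hypothetical sketch of the TaskTracker side of this loop; the protocol
and method names here (pollForTask, reportStatus) are illustrative
stand-ins, not the real inter-tracker protocol:

import java.io.IOException;

public class HeartbeatLoop {

  interface TrackerProtocol {                   // stand-in RPC interface
    Runnable pollForTask() throws IOException;  // new work, or null
    void reportStatus(String status) throws IOException;
  }

  static void run(TrackerProtocol jobTracker) throws Exception {
    while (true) {
      Runnable task = jobTracker.pollForTask();
      if (task != null) {
        new Thread(task).start();               // tasks run outside the loop
      }
      jobTracker.reportStatus("heartbeat");     // status every 10 seconds
      Thread.sleep(10_000);
    }
  }
}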
Application Launcher
• Application code to create JobConf and set
  the parameters.
  – Mapper, Reducer classes
  – InputFormat and OutputFormat classes
  – Combiner class, if desired
• Writes JobConf and the application jar to DFS
  and submits job to JobTracker.
• Can exit immediately or wait for the job to
  complete or fail.

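
A sketch of the two launch styles with the old JobClient API (method
names are from a slightly later release than these slides):

import java.io.IOException;

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class Launch {
  public static void launch(JobConf conf, boolean wait) throws IOException {
    if (wait) {
      // Blocking: submits the job and polls until it completes or
      // fails, throwing IOException on failure.
      JobClient.runJob(conf);
    } else {
      // Fire-and-forget: submit and return immediately.
      JobClient client = new JobClient(conf);
      RunningJob job = client.submitJob(conf);
      System.out.println("submitted job " + job.getJobID());
    }
  }
}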
JobTracker
• Takes JobConf and creates an instance of
  the InputFormat. Calls the getSplits method to
  generate map inputs.
• Creates a JobInProgress object and a set of
  TaskInProgress (“TIP”) and Task objects.
  – JobInProgress is the status of the job.
  – TaskInProgress is the status of a fragment of
    work.
  – Task is an attempt to do a TIP.
• As TaskTrackers request work, they are given
  Tasks to execute.
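
The split-generation step looks roughly like this (a sketch against the
old mapred API, not the JobTracker’s actual code):

import java.io.IOException;

import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;

public class SplitSketch {
  static InputSplit[] computeSplits(JobConf job) throws IOException {
    // Instantiate the job's InputFormat and ask it for splits;
    // each split becomes one map TIP.
    InputFormat inputFormat = job.getInputFormat();
    // numSplits is only a hint; the InputFormat decides the real count.
    return inputFormat.getSplits(job, job.getNumMapTasks());
  }
}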
TaskTracker
• All Tasks
  –   Create the TaskRunner
  –   Copy the job.jar and job.xml from DFS.
  –   Localize the JobConf for this Task.
  –   Call task.prepare() (details later)
  –   Launch the Task in a new JVM under
      TaskTracker.Child.
  –   Catch output from Task and log it at the info level.
  –   Take Task status updates and send to JobTracker
      every 10 seconds.
  –   If job is killed, kill the task.
  –   If task dies or completes, tell the JobTracker.
TaskTracker for Reduces
• For Reduces, the task.prepare() fetches all of
  the relevant map outputs for this reduce.
• Files are fetched over HTTP from the Jetty server
  on the map’s TaskTracker.
• Files are fetched in parallel threads, with at most
  one concurrent fetch per host.
• When fetches fail, a backoff scheme is used
  to keep from overloading TaskTrackers.
• Fetching accounts for the first 33% of the
  reduce’s progress.
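
An illustrative sketch of that fetch policy in plain Java (not Hadoop’s
actual shuffle code): at most one in-flight fetch per host, with
exponential backoff against hosts whose fetches fail:

import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ShuffleSketch {
  private final Set<String> busyHosts = new HashSet<>();
  private final Map<String, Long> penaltyUntil = new HashMap<>();

  // Called by an idle copier thread to pick its next source host.
  synchronized String chooseHost(Collection<String> pendingHosts) {
    long now = System.currentTimeMillis();
    for (String host : pendingHosts) {
      boolean penalized = penaltyUntil.getOrDefault(host, 0L) > now;
      if (!busyHosts.contains(host) && !penalized) {
        busyHosts.add(host);          // only one concurrent fetch per host
        return host;
      }
    }
    return null;                      // nothing eligible right now
  }

  // Called when a fetch finishes; failures penalize the host for an
  // exponentially growing interval.
  synchronized void finished(String host, boolean ok, int failures) {
    busyHosts.remove(host);
    if (!ok) {
      long delayMs = (long) (10_000 * Math.pow(2, failures));
      penaltyUntil.put(host, System.currentTimeMillis() + delayMs);
    }
  }
}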
Map Tasks
• Use the InputFormat object to create a
  RecordReader from the FileSplit.
• Loop through the keys and values in the
  FileSplit and feed each to the mapper.
• Without a combiner, the keys and values for each
  reduce are written to a SequenceFile.
• With a combiner, the framework buffers 100,000
  keys and values, sorts, combines, and writes them
  to SequenceFiles, one per reduce.
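
The map-side loop, sketched against the old mapred interfaces (modeled
on what the MapRunner class does; treat the details as illustrative):

import java.io.IOException;

import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class MapLoop<K1, V1, K2, V2> {
  void run(InputFormat<K1, V1> inputFormat, InputSplit split, JobConf job,
           Mapper<K1, V1, K2, V2> mapper, OutputCollector<K2, V2> output,
           Reporter reporter) throws IOException {
    // Build a RecordReader for this task's split and feed every
    // record to the mapper.
    RecordReader<K1, V1> in = inputFormat.getRecordReader(split, job, reporter);
    try {
      K1 key = in.createKey();
      V1 value = in.createValue();
      while (in.next(key, value)) {   // false at the end of the split
        mapper.map(key, value, output, reporter);
      }
    } finally {
      in.close();
    }
  }
}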
Reduce Tasks: Sort
• Sort
  – 33% to 66% of reduce’s progress
  – Base
     • Read 100 MB (io.sort.mb) of keys and values into
       memory.
     • Sort them in memory.
     • Write the sorted run to disk.
  – Merge
     • Merge 10 (io.sort.factor) files into one.
     • Repeat as many times as required (2 levels for 100 files,
       3 levels for 1000 files, etc.)

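
The same two phases in a self-contained plain-Java illustration (not
Hadoop’s SequenceFile sorter): sort buffer-sized runs in memory, then
merge up to factor runs at a time until one sorted run remains:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class ExternalSortSketch {

  // Phase 1: read bufferSize records at a time, sort, and "spill" a run.
  static List<List<String>> makeRuns(Iterator<String> input, int bufferSize) {
    List<List<String>> runs = new ArrayList<>();
    while (input.hasNext()) {
      List<String> run = new ArrayList<>(bufferSize);
      while (input.hasNext() && run.size() < bufferSize) {
        run.add(input.next());
      }
      Collections.sort(run);   // sort the memory
      runs.add(run);           // stands in for writing the run to disk
    }
    return runs;
  }

  // One streaming k-way merge over a group of sorted runs. Heap entries
  // are (run index, position in run), ordered by the record they point at.
  static List<String> mergeOnce(List<List<String>> group) {
    PriorityQueue<int[]> heap = new PriorityQueue<>(
        Comparator.comparing((int[] h) -> group.get(h[0]).get(h[1])));
    for (int r = 0; r < group.size(); r++) {
      if (!group.get(r).isEmpty()) heap.add(new int[] {r, 0});
    }
    List<String> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      int[] h = heap.poll();
      out.add(group.get(h[0]).get(h[1]));
      if (h[1] + 1 < group.get(h[0]).size()) heap.add(new int[] {h[0], h[1] + 1});
    }
    return out;
  }

  // Phase 2: merge `factor` runs per pass; 100 runs with factor 10 take
  // two passes, 1000 runs take three.
  static List<String> merge(List<List<String>> runs, int factor) {
    while (runs.size() > 1) {
      List<List<String>> next = new ArrayList<>();
      for (int i = 0; i < runs.size(); i += factor) {
        next.add(mergeOnce(runs.subList(i, Math.min(i + factor, runs.size()))));
      }
      runs = next;
    }
    return runs.isEmpty() ? new ArrayList<>() : runs.get(0);
  }
}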
Reduce Tasks: Reduce
• Reduce
  – 66% to 100% of reduce’s progress
  – Use a SequenceFile.Reader to read sorted input
    and pass to reducer one key at a time along with
    the associated values.
  – Output keys and values are written to the
    OutputFormat object, which usually writes a file to
    DFS.
  – The output from the reduce is NOT resorted, so it
    is in the order and fragmentation of the map output
    keys.
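
A plain-Java sketch of that grouping scan (illustrative, not Hadoop
code): the sorted stream is read once, and each run of equal keys is
handed to the reducer as one key plus all of its values:

import java.util.ArrayList;
import java.util.List;

public class ReduceLoopSketch {
  public static void main(String[] args) {
    // Already sorted by key, as the sort phase guarantees.
    String[][] sorted = {{"Owen","1"}, {"Owen","1"}, {"bye","1"}, {"hi","1"}};
    int i = 0;
    while (i < sorted.length) {
      String key = sorted[i][0];
      List<String> values = new ArrayList<>();
      while (i < sorted.length && sorted[i][0].equals(key)) {
        values.add(sorted[i][1]);
        i++;
      }
      // reduce(key, values) runs here; WordCount would sum the values.
      System.out.println(key + " -> " + values);
    }
  }
}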
