The Next Generation of Hadoop Map-Reduce
Sharad Agarwal
sharadag@yahoo-inc.com
sharad@apache.org
About Me
- Hadoop Committer and PMC member
- Architect at Yahoo!
Hadoop Map-Reduce Today
- JobTracker: manages cluster resources and job scheduling
- TaskTracker: per-node agent that manages tasks
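For context, a minimal sketch of how a job reaches the JobTracker today, using the classic org.apache.hadoop.mapred API. The class name and the input/output paths are placeholders; with no mapper or reducer set, the identity defaults are used, which is enough to show the submission path.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SubmitToJobTracker {
      public static void main(String[] args) throws Exception {
        // The JobConf carries everything the JobTracker needs to schedule the job.
        JobConf conf = new JobConf(SubmitToJobTracker.class);
        conf.setJobName("identity-copy");
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        // JobClient hands the job to the single JobTracker, which both allocates
        // slots on TaskTrackers and tracks every map and reduce task of the job.
        JobClient.runJob(conf);
      }
    }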
Current Limitations
- Scalability
  - Maximum cluster size: 4,000 nodes
  - Maximum concurrent tasks: 40,000
  - Coarse synchronization in the JobTracker
- Single point of failure
  - A failure kills all queued and running jobs
  - Jobs need to be re-submitted by users
  - Restart is very tricky due to complex state
- Hard partition of resources into map and reduce slots
Current Limitations
- Lacks support for alternate paradigms
  - Iterative applications implemented using Map-Reduce are 10x slower (e.g. K-Means, PageRank)
- Lack of wire-compatible protocols
  - Client and cluster must be the same version
  - Applications and workflows cannot migrate to different clusters
Next Generation Map-Reduce Requirements
- Reliability
- Availability
- Scalability: clusters of 6,000 machines
  - Each machine with 16 cores, 48 GB RAM, 24 TB of disk
  - 100,000 concurrent tasks
  - 10,000 concurrent jobs
- Wire compatibility
- Agility & evolution: ability for customers to control upgrades to the grid software stack
Next Generation Map-Reduce – Design Centre
- Split up the two major functions of the JobTracker:
  - Cluster resource management
  - Application life-cycle management
- Map-Reduce becomes a user-land library
Architecture
Architecture
- Resource Manager
  - Global resource scheduler
  - Hierarchical queues
- Node Manager
  - Per-machine agent
  - Manages the life-cycle of containers
  - Container resource monitoring
- Application Master
  - Per-application
  - Manages application scheduling and task execution
  - E.g. the Map-Reduce Application Master
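To make the split concrete, here is a minimal, hypothetical sketch of how a client hands an application to the Resource Manager, using the YarnClient API as it later shipped in Hadoop 2.x; the Application Master main class is a placeholder.

    import java.util.Collections;
    import org.apache.hadoop.yarn.api.records.ApplicationId;
    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
    import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;
    import org.apache.hadoop.yarn.util.Records;

    public class SubmitToResourceManager {
      public static void main(String[] args) throws Exception {
        // Connect to the Resource Manager, the global scheduler.
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Ask the Resource Manager for a new application id.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationSubmissionContext ctx = app.getApplicationSubmissionContext();
        ctx.setApplicationName("example-app");
        ctx.setQueue("default");  // one of the hierarchical scheduler queues

        // Describe the container that will run the (application-specific)
        // Application Master; the main class here is a hypothetical placeholder.
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList(
            "java com.example.ExampleApplicationMaster"));
        ctx.setAMContainerSpec(amContainer);
        ctx.setResource(Resource.newInstance(1024, 1)); // 1 GB, 1 core for the AM

        // The Resource Manager only schedules the AM container; everything else
        // about the application's life-cycle is the AM's job.
        ApplicationId appId = yarnClient.submitApplication(ctx);
        System.out.println("Submitted " + appId);
      }
    }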
Improvements vis-à-vis current Map-Reduce: Scalability
- Application life-cycle management is very expensive
- Partition resource management and application life-cycle management
- Application management is distributed
- Hardware trends: currently run clusters of 4,000 machines
  - 6,000 machines of 2012 > 12,000 machines of 2009
  - <8 cores, 16 GB, 4 TB> vs. <16+ cores, 48/96 GB, 24 TB>
Improvements vis-à-vis current Map-Reduce: Availability
- Application Master
  - Optional failover via application-specific checkpoint
  - Map-Reduce applications pick up where they left off
- Resource Manager
  - No single point of failure: failover via ZooKeeper
  - Application Masters are restarted automatically
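As one illustration, the sketch below enables the ZooKeeper-backed Resource Manager failover described above, using configuration keys from later Hadoop 2.x releases; the host names and the ZooKeeper quorum are placeholders.

    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class ResourceManagerHaConfig {
      public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();

        // Run two Resource Managers; active/standby election goes through
        // ZooKeeper, so losing the active RM no longer loses the cluster.
        conf.setBoolean("yarn.resourcemanager.ha.enabled", true);
        conf.set("yarn.resourcemanager.ha.rm-ids", "rm1,rm2");
        conf.set("yarn.resourcemanager.hostname.rm1", "rm1.example.com"); // placeholder hosts
        conf.set("yarn.resourcemanager.hostname.rm2", "rm2.example.com");
        conf.set("yarn.resourcemanager.zk-address",
            "zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181");

        // Persist application state so Application Masters can be restarted
        // automatically after a failover.
        conf.setBoolean("yarn.resourcemanager.recovery.enabled", true);
        conf.set("yarn.resourcemanager.store.class",
            "org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore");
      }
    }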
Improvements vis-à-vis current Map-Reduce: Wire Compatibility
- Protocols are wire-compatible
- Old clients can talk to new servers
- Rolling upgrades
Improvements vis-à-vis current Map-Reduce: Agility / Evolution
- Map-Reduce now becomes a user-land library
- Multiple versions of Map-Reduce can run in the same cluster (à la Apache Pig)
- Faster deployment cycles for improvements
- Customers upgrade Map-Reduce versions on their own schedule
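A hypothetical sketch of what "user-land library" allows, using the framework-via-distributed-cache mechanism from later Hadoop 2.x releases: the job pins the Map-Reduce version it runs against, independently of what other jobs on the cluster use. The HDFS tarball path is a placeholder, and the classpath entries depend on the tarball layout.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class PinnedMapReduceVersion {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.framework.name", "yarn");

        // Ship a specific Map-Reduce library build with the job via the
        // distributed cache, instead of using whatever is installed on the nodes.
        // "#mr-framework" is the link name the archive is unpacked under.
        conf.set("mapreduce.application.framework.path",
            "hdfs:///apps/mapreduce/mr-framework-2.x.tar.gz#mr-framework");
        conf.set("mapreduce.application.classpath",
            "mr-framework/share/hadoop/mapreduce/*,mr-framework/share/hadoop/mapreduce/lib/*");

        Job job = Job.getInstance(conf, "job-on-pinned-mr-version");
        // ... configure mapper, reducer, input and output as usual ...
      }
    }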
Improvements vis-à-vis current Map-Reduce: Utilization
- Generic resource model
  - Memory
  - CPU
  - Disk bandwidth
  - Network bandwidth
- Remove the fixed partition of map and reduce slots
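A minimal sketch of what the generic resource model looks like to an application, assuming the container-request records from later Hadoop 2.x releases; note that the shipped API models memory and CPU, while disk and network bandwidth remained part of the design rather than the implementation.

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class GenericResourceRequest {
      public static void main(String[] args) {
        // A container is asked for in terms of resources, not map or reduce slots:
        // here, 2048 MB of memory and 2 virtual cores.
        Resource capability = Resource.newInstance(2048, 2);

        // No node or rack constraint; the priority disambiguates requests
        // within the same application.
        ContainerRequest request =
            new ContainerRequest(capability, null, null, Priority.newInstance(1));
        System.out.println("Would request: " + request);
      }
    }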
Improvements vis-à-vis current Map-Reduce: Support for programming paradigms other than Map-Reduce
- MPI
- Master-Worker
- Machine learning
- Iterative processing
- Enabled by allowing use of a paradigm-specific Application Master
- Run all on the same Hadoop cluster
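The hook for new paradigms is writing your own Application Master. Below is a hypothetical, heavily trimmed skeleton of that protocol (register, request containers, heartbeat, unregister), using the AMRMClient API from later Hadoop 2.x releases; what runs inside the granted containers is entirely up to the paradigm.

    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
    import org.apache.hadoop.yarn.api.records.Container;
    import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.conf.YarnConfiguration;

    public class ParadigmSpecificApplicationMaster {
      public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
        rm.init(new YarnConfiguration());
        rm.start();

        // 1. Register with the Resource Manager (host/port/URL left empty here).
        rm.registerApplicationMaster("", 0, "");

        // 2. Ask for worker containers; they could host MPI ranks, iterative
        //    workers, a master-worker pool, etc.
        for (int i = 0; i < 4; i++) {
          rm.addContainerRequest(new ContainerRequest(
              Resource.newInstance(1024, 1), null, null, Priority.newInstance(0)));
        }

        // 3. Heartbeat loop: each allocate() call reports progress and picks up
        //    newly granted containers.
        int launched = 0;
        while (launched < 4) {
          AllocateResponse response = rm.allocate(launched / 4.0f);
          for (Container c : response.getAllocatedContainers()) {
            launched++;  // a real AM would launch a process in the container here
          }
          Thread.sleep(1000);
        }

        // 4. Tell the Resource Manager the application is done.
        rm.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
        rm.stop();
      }
    }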
Summary
- The next generation of Map-Reduce takes Hadoop to the next level
  - Scale out even further
  - High availability
  - Cluster utilization
  - Support for paradigms other than Map-Reduce
Questions?
