SlideShare a Scribd company logo
How to monitor the  $H!T out of Hadoop Developing a comprehensive open approach to monitoring hadoop clusters
Relevant Hadoop Information From 3 – 3000 Nodes Hardware/Software failures “common” Redundant Components DataNode, TaskTracker Non-redundant Components NameNode, JobTracker, SecondaryNameNode Fast Evolving Technology (Best Practices?)
Monitoring Software Nagios –  Red Yellow Green Alerts, Escalations Defacto Standard – Widely deployed Text base configuration Web Interface Pluggable with shell scripts/external apps Return 0 - OK
Cacti Performance Graphing System RRD/RRA Front End Slick Web Interface Template System for Graph Types Pluggable SNMP input Shell script /external program

Recommended for you

Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your ApplicationHadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application

This document provides guidelines for tuning Hadoop for performance. It discusses key factors that influence Hadoop performance like hardware configuration, application logic, and system bottlenecks. It also outlines various configuration parameters that can be tuned at the cluster and job level to optimize CPU, memory, disk throughput, and task granularity. Sample tuning gains are shown for a webmap application where tuning multiple parameters improved job execution time by up to 22%.

indiahadoopsummit2010
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster

Three identical Apache Hadoop clusters were provisioned on Joyent infrastructure using different operating systems: SmartOS, Ubuntu, and KVM virtual machines. Monitoring showed the Ubuntu and KVM clusters spent more time in the OS kernel during I/O operations compared to the SmartOS cluster. The SmartOS cluster was able to utilize CPU resources more efficiently and scale to more mappers and reducers. Basic cluster configuration and tuning the number of map and reduce tasks are important to optimize Hadoop performance.

apache hadoopnosqlhadoop
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera

Performance is a thing that you can never have too much of. But performance is a nebulous concept in Hadoop. Unlike databases, there is no equivalent in Hadoop to TPC, and different use cases experience performance differently. This talk will discuss advances on how Hadoop performance is measured and will also talk about recent and future advances in performance in different areas of the Hadoop stack.

hadoop worldhw2011hadoop and performace
 
hadoop-cacti-jtg JMX Fetching Code w/ (kick off) scripts Cacti templates For Hadoop Premade Nagios Check Scripts Helper/Batch/automation scripts Apache License
Hadoop JMX
Sample Cluster P1 NameNode & SecNameNode Hardware RAID 8 GB RAM 1x QUAD CORE DerbyDB (hive) on SecNameNode JobTracker 8GB RAM 1x QUAD CORE

Recommended for you

White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning

This document summarizes key Hadoop configuration parameters that affect MapReduce job performance and provides suggestions for optimizing these parameters under different conditions. It describes the MapReduce workflow and phases, defines important parameters like dfs.block.size, mapred.compress.map.output, and mapred.tasktracker.map/reduce.tasks.maximum. It explains how to configure these parameters based on factors like cluster size, data and task complexity, and available resources. The document also discusses other performance aspects like temporary space, JVM tuning, and reducing reducer initialization overhead.

From docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayFrom docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native way

Creating containers for an application is easy (even if it’s a goold old distributed application like Apache Hadoop), just a few steps of packaging. The hard part isn't packaging: it's deploying How can we run the containers together? How to configure them? How do the services in the containers find and talk to each other? How do you deploy and manage clusters with hundred of nodes? Modern cloud native tools like Kubernetes or Consul/Nomad could help a lot but they could be used in different way. It this presentation I will demonstrate multiple solutions to manage containerized clusters with different cloud-native tools including kubernetes, and docker-swarm/compose. No matter which tools you use, the same questions of service discovery and configuration management arise. This talk will show the key elements needed to make that containerized cluster work. Tools: kubernetes, docker-swam, docker-compose, consul, consul-template, nomad together with: Hadoop, Yarn, Spark, Kafka, Zookeeper, Storm…. References: https://github.com/flokkr Speaker Marton Elek, Lead Software Engineer, Hortonworks

apache hadoopclouddocker / container
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)

Hadoop is emerging as the standard for big data processing and analytics. However, as usage of the Hadoop clusters grow, so do the demands of managing and monitoring these systems. In this full-day Strata Hadoop World tutorial, attendees will get an overview of all phases for successfully managing Hadoop clusters, with an emphasis on production systems — from installation, to configuration management, service monitoring, troubleshooting and support integration. We will review tooling capabilities and highlight the ones that have been most helpful to users, and share some of the lessons learned and best practices from users who depend on Hadoop as a business-critical system.

hadoop
A Sample Cluster p2 Slave (hadoopdata1-XXXX) JBOD 8x 1TB SATA Disk RAM 16GB 2x Quad Core
Prerequisites Nagios (install) DAG RPMs Cacti (install) Several RPMS Liberal network access to the cluster
Alerts & Escalations X nodes * Y Services = < Sleep Define a policy  Wake Me Up’s (SMS) Don’t Wake Me Up’s (EMAIL) Review (Daily, Weekly, Monthly)
Wake Me Up’s NameNode Disk Full (Big Big Headache) RAID Array Issues (failed disk) JobTracker SecNameNode Do not realize it is not working too late

Recommended for you

Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan

Hadoop is an open-source framework that processes large datasets in a distributed manner across commodity hardware. It uses a distributed file system (HDFS) and MapReduce programming model to store and process data. Hadoop is highly scalable, fault-tolerant, and reliable. It can handle data volumes and variety including structured, semi-structured and unstructured data.

hadoopinstall
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration

1. The document provides 7 simple Linux configuration tips to improve Hadoop cluster performance. The tips include disabling swapping, mounting data disks with noatime, disabling root reserved space, enabling nscd, increasing file handle limits, using a dedicated OS disk, and ensuring proper name resolution. 2. It also discusses additional optional tips like checking disk I/O, disabling transparent huge pages, enabling jumbo frames, and monitoring systems. 3. The document recommends reading the Hadoop Operations book and taking questions.

Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned

This document provides an overview and lessons learned from Hadoop. It discusses why Hadoop is used, how MapReduce and HDFS work, tips for integration and operations, and the outlook for the Hadoop community moving forward with real-time capabilities and refined APIs. Key takeaways include only using Hadoop if necessary, fully understanding your data pipeline, and "unboxing the black box" of Hadoop.

hadoop mapreduce
Don’t Wake Me Up’s Or ‘Wake someone else up’ DataNode Warning Currently Failed Disk will down the Data Node (see Jira) TaskTracker Hardware Bad Disk (Start RMA) Slaves are expendable (up to a point)
Monitoring Battle Plan Start With the Basics Ping, Disk Add Hadoop Specific Alarms  check_data_node Add JMX Graphing NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
The Basics Nagios Nagios (All Nodes) Host up (Ping check) Disk % Full SWAP > 85 % * Load based alarms are somewhat useless  389% CPU load is not necessarily a bad thing in Hadoopville
The Basics Cacti Cacti (All Nodes) CPU (full CPU) RAM/SWAP  Network Disk Usage

Recommended for you

Tune hadoop
Tune hadoopTune hadoop
Tune hadoop

This document provides tips for tuning Hadoop clusters and jobs. It recommends: 1) Choosing optimal numbers of mappers and reducers per node and oversubscribing CPUs slightly. 2) Adjusting memory allocations for tasks and ensuring they do not exceed total memory available. 3) Increasing buffers for sorting and shuffling, compressing intermediate data, and using combiners to reduce data sent to reducers.

tuninghadoopperformance
Hadoop & HDFS for Beginners
Hadoop & HDFS for BeginnersHadoop & HDFS for Beginners
Hadoop & HDFS for Beginners

This presentation provides a basic overview on Hadoop, Map-Reduce and HDFS related concepts, Configuration and Installation steps and a Sample code.

google file systemmapreducehadoop
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides

This document provides an overview of setting up a Hadoop cluster, including installing the Apache Hadoop distribution, configuring SSH keys for passwordless login between nodes, configuring environment variables and Hadoop configuration files, and starting and stopping the HDFS and MapReduce services. It also briefly discusses alternative Hadoop distributions from Cloudera and Yahoo, as well as using cloud platforms like Amazon EC2 for Hadoop clusters.

Disk Utilization
RAID Tools Hpacucli – not a Street Fighter move Alerts on RAID events (NameNode)  Disk failed  Rebuilding JBOD (DataNode) Failed Drive Drive Errors Dell, SUN, Vendor Specific Tools
Before you jump in X Nodes * Y Checks * = Lots of work About 3 Nodes into the process … Wait!!! I need some interns!!! Solution S.I.C.C.T.  Semi-Intelligent-Configuration-cloning-tools (I made that up)  (for this presentation)
Nagios Answers “IS IT RUNNING?” Text based Configuration

Recommended for you

Hadoop Cluster With High Availability
Hadoop Cluster With High AvailabilityHadoop Cluster With High Availability
Hadoop Cluster With High Availability

With the advent of Hadoop, there comes the need for professionals skilled in Hadoop Administration making it imperative to be skilled as a Hadoop Admin for better career, salary and job opportunities. Know how to setup a Hadoop Cluster With HDFS High Availability here : www.edureka.co/blog/how-to-set-up-hadoop-cluster-with-hdfs-high-availability/

hadoop clustershigh availability
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5

Are you taking advantage of all of Hadoop’s features to operate a stable and effective cluster? Inspired by real-world support cases, this talk discusses best practices and new features to help improve incident response and daily operations. Chances are that you’ll walk away from this talk with some new ideas to implement in your own clusters.

hdfshadoophadoopsummit
Big data interview questions and answers
Big data interview questions and answersBig data interview questions and answers
Big data interview questions and answers

This document provides an overview of the Hadoop Distributed File System (HDFS), including its goals, design, daemons, and processes for reading and writing files. HDFS is designed for storing very large files across commodity servers, and provides high throughput and reliability through replication. The key components are the NameNode, which manages metadata, and DataNodes, which store data blocks. The Secondary NameNode assists the NameNode in checkpointing filesystem state periodically.

big databig data traininghadoop training
Cacti Answers “HOW WELL IS IT RUNNING?” Web Based configuration  php-cli tools
Monitoring Battle Plan Thus Far Start With the Basics Ping, Disk !!!!!!Done!!!!!! Add Hadoop Specific Alarms  check_data_node Add JMX Graphing NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
Add Hadoop Specific Alarms Hadoop Components with a Web Interface NameNode 50070 JobTracker 50030 TaskTracker 50060 DataNode 50075 check_http + regex = simple + effective
nagios_check_commands.cfg Component Failure (Future) Newer Hadoop will have XML status  define command { command_name  check_remote_namenode command_line  $USER1$/check_http -H  $HOSTADDRESS$ -u http://$HOSTADDRESS$:$ARG1$/dfshealth.jsp -p $ARG1$ -r NameNode } define service {                service_description            check_remote_namenode                use                              generic-service                host_name                        hadoopname1                check_command               check_remote_namenode!50070 }

Recommended for you

Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)

Hadoop is a distributed processing framework for large datasets. It stores data across clusters of commodity hardware in a Hadoop Distributed File System (HDFS) and provides tools for distributed processing using MapReduce. HDFS uses a master-slave architecture with a namenode managing metadata and datanodes storing data blocks. Data is replicated across nodes for reliability. MapReduce allows distributed processing of large datasets in parallel across clusters.

mapreducehadooparchitecture
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop

This document provides an overview and introduction to Hadoop, an open-source framework for storing and processing large datasets in a distributed computing environment. It discusses what Hadoop is, common use cases like ETL and analysis, key architectural components like HDFS and MapReduce, and why Hadoop is useful for solving problems involving "big data" through parallel processing across commodity hardware.

hadoopmapreduce
Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performance

The document discusses optimizing performance in MapReduce jobs. It covers understanding bottlenecks through metrics and logs, tuning parameters to reduce spills during the map task sort and spill phase like io.sort.mb and io.sort.record.percent, and tips for reducer fetch tuning. The goal is to help developers understand and address bottlenecks in their MapReduce jobs to improve performance.

clouderaapache hadoophadoop summit
Monitoring Battle Plan Start With the Basics Ping, Disk (Done) Add Hadoop Specific Alarms  check_data_node (Done) Add JMX Graphing NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
JMX Graphing Enable JMX Import Templates
JMX Graphing
JMX Graphing

Recommended for you

Hadoop & Big Data benchmarking
Hadoop & Big Data benchmarkingHadoop & Big Data benchmarking
Hadoop & Big Data benchmarking

This document discusses benchmarking Hadoop and big data systems. It provides an overview of common Hadoop benchmarks including microbenchmarks like TestDFSIO, TeraSort, and NNBench which test individual Hadoop components. It also describes BigBench, a benchmark modeled after TPC-DS that aims to test a more complete big data analytics workload using techniques like MapReduce, Hive, and Mahout across structured, semi-structured, and unstructured data. The document emphasizes using Hadoop distributions for administration and both microbenchmarks and full benchmarks like BigBench for evaluation.

big benchclouderabenchmarks
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning

This document provides an agenda and overview for a presentation on Hadoop 2.x configuration and MapReduce performance tuning. The presentation covers hardware selection and capacity planning for Hadoop clusters, key configuration parameters for operating systems, HDFS, and YARN, and performance tuning techniques for MapReduce applications. It also demonstrates the Hadoop Vaidya performance diagnostic tool.

configurationhdfsvaidya
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark

This presentation is an analysis of the observed trends in the transition from the Hadoop ecosystem to the Spark ecosystem. The related talk took place at the Chicago Hadoop User Group (CHUG) meetup held on February 12, 2015.

mapreducehadoopspark
JMX Graphing
 
Standard Java JMX
Monitoring Battle Plan Thus Far Start With the Basics !!!!!!Done!!!!! Ping, Disk Add Hadoop Specific Alarms !Done! check_data_node Add JMX Graphing !Done! NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%

Recommended for you

Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery

The document discusses backup and disaster recovery strategies for Hadoop. It focuses on protecting data sets stored in HDFS. HDFS uses data replication and checksums to protect against disk and node failures. Snapshots can protect against data corruption and accidental deletes. The document recommends copying data from the primary to secondary site for disaster recovery rather than teeing, and discusses considerations for large data movement like bandwidth needs and security. It also notes the importance of backing up metadata like Hive configurations along with core data.

apache hadoophadoop recoveryhadoop backup
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop

Hadoop, flexible and available architecture for large scale computation and data processing on a network of commodity hardware.

hadoophbasehive
Ganglia轻度使用指南
Ganglia轻度使用指南Ganglia轻度使用指南
Ganglia轻度使用指南

Tips for Ganglia on Prototype, Test, and more.

ptototypeganglia
Add JMX based Alarms hadoop-cacti-jtg is flexible extend fetch classes Don’t call output() Write your own check logic
Quick JMX Base Walkthrough  url, user, pass, object specified from CLI wantedVariables, wantedOperations by inheritance fetch() output() provided
Extend for NameNode
Extend for Nagios

Recommended for you

mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2

TV audiences are demanding the ability to watch their favorite programs whenever it’s most convenient for them – whether that be during a live broadcast, in a few hours, in a few days, or even in a few weeks post-broadcast.

live encoderstart-over videovod
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities

Enterprise workspaces empower employees to create flexible, personalized, self-service and mobile ready work environments that combine business applications and content with superior user experience and advanced social and collaborative capabilities - anywhere, anytime. With enterprise workspaces companies can drive people productivity, expand the value of their IT investments and delight business users

sap portal mobile workspaces
Energy Strategy Group_Report 2012 efficienza energetica
Energy Strategy Group_Report 2012 efficienza energeticaEnergy Strategy Group_Report 2012 efficienza energetica
Energy Strategy Group_Report 2012 efficienza energetica
Monitoring Battle Plan Start With the Basics !DONE! Ping, Disk Add Hadoop Specific Alarms !DONE! check_data_node Add JMX Graphing !DONE! NameNodeOperations Add JMX Based alarms !DONE! FilesTotal > 1,000,000 or LiveNodes < 50%
Review File System Growth Size Number of Files Number of Blocks Ratio’s Utilization CPU/Memory Disk Email (nightly) FSCK  DSFADMIN
The Future JMX Coming to JobTracker and TaskTracker (0.21) Collect and Graph Jobs Running Collect and Graph Map / Reduce per node Profile Specific Jobs in Cacti?

More Related Content

What's hot

Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
Alex Moundalexis
 
Deployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersDeployment and Management of Hadoop Clusters
Deployment and Management of Hadoop Clusters
Amal G Jose
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jk
Edureka!
 
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your ApplicationHadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Yahoo Developer Network
 
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
Altoros
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Cloudera, Inc.
 
White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning
Anil Reddy
 
From docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayFrom docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native way
DataWorks Summit
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
Kathleen Ting
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
Narayana B
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
DataWorks Summit
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
tcurdt
 
Tune hadoop
Tune hadoopTune hadoop
Tune hadoop
Jason Shao
 
Hadoop & HDFS for Beginners
Hadoop & HDFS for BeginnersHadoop & HDFS for Beginners
Hadoop & HDFS for Beginners
Rahul Jain
 
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
ryancox
 
Hadoop Cluster With High Availability
Hadoop Cluster With High AvailabilityHadoop Cluster With High Availability
Hadoop Cluster With High Availability
Edureka!
 
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5
Chris Nauroth
 
Big data interview questions and answers
Big data interview questions and answersBig data interview questions and answers
Big data interview questions and answers
Kalyan Hadoop
 
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hari Shankar Sreekumar
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
Ovidiu Dimulescu
 

What's hot (20)

Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Deployment and Management of Hadoop Clusters
Deployment and Management of Hadoop ClustersDeployment and Management of Hadoop Clusters
Deployment and Management of Hadoop Clusters
 
Introduction to hadoop administration jk
Introduction to hadoop administration   jkIntroduction to hadoop administration   jk
Introduction to hadoop administration jk
 
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your ApplicationHadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
 
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
 
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, ClouderaHadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
Hadoop World 2011: Hadoop and Performance - Todd Lipcon & Yanpei Chen, Cloudera
 
White paper hadoop performancetuning
White paper hadoop performancetuningWhite paper hadoop performancetuning
White paper hadoop performancetuning
 
From docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native wayFrom docker to kubernetes: running Apache Hadoop in a cloud native way
From docker to kubernetes: running Apache Hadoop in a cloud native way
 
Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)Hadoop Operations for Production Systems (Strata NYC)
Hadoop Operations for Production Systems (Strata NYC)
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
 
Improving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux ConfigurationImproving Hadoop Cluster Performance via Linux Configuration
Improving Hadoop Cluster Performance via Linux Configuration
 
Hadoop - Lessons Learned
Hadoop - Lessons LearnedHadoop - Lessons Learned
Hadoop - Lessons Learned
 
Tune hadoop
Tune hadoopTune hadoop
Tune hadoop
 
Hadoop & HDFS for Beginners
Hadoop & HDFS for BeginnersHadoop & HDFS for Beginners
Hadoop & HDFS for Beginners
 
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter SlidesJuly 2010 Triangle Hadoop Users Group - Chad Vawter Slides
July 2010 Triangle Hadoop Users Group - Chad Vawter Slides
 
Hadoop Cluster With High Availability
Hadoop Cluster With High AvailabilityHadoop Cluster With High Availability
Hadoop Cluster With High Availability
 
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5
 
Big data interview questions and answers
Big data interview questions and answersBig data interview questions and answers
Big data interview questions and answers
 
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
Hadoop architecture (Delhi Hadoop User Group Meetup 10 Sep 2011)
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 

Viewers also liked

Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performance
DataWorks Summit
 
Hadoop & Big Data benchmarking
Hadoop & Big Data benchmarkingHadoop & Big Data benchmarking
Hadoop & Big Data benchmarking
Bart Vandewoestyne
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
Vitthal Gogate
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
Slim Baltagi
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
Cloudera, Inc.
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 
Ganglia轻度使用指南
Ganglia轻度使用指南Ganglia轻度使用指南
Ganglia轻度使用指南
Schubert Zhang
 
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
thePlatform
 
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
SAP Portal
 
Energy Strategy Group_Report 2012 efficienza energetica
Energy Strategy Group_Report 2012 efficienza energeticaEnergy Strategy Group_Report 2012 efficienza energetica
Energy Strategy Group_Report 2012 efficienza energetica
Eugenio Bacile di Castiglione
 
Nt1310 project
Nt1310 projectNt1310 project
Nt1310 project
Nathan Pennington
 
Credit cards
Credit cardsCredit cards
Credit cards
ThePointsGuy
 
Basics of Coding in Pediatrics Medical Billing
Basics of Coding in Pediatrics Medical BillingBasics of Coding in Pediatrics Medical Billing
Basics of Coding in Pediatrics Medical Billing
Outsource Strategies International
 
Context Based Authentication
Context Based AuthenticationContext Based Authentication
Context Based Authentication
PortalGuard dba PistolStar, Inc.
 
Secure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the WebSecure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the Web
SafeNet
 
cathy resume
cathy resumecathy resume

Viewers also liked (16)

Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performance
 
Hadoop & Big Data benchmarking
Hadoop & Big Data benchmarkingHadoop & Big Data benchmarking
Hadoop & Big Data benchmarking
 
Hadoop configuration & performance tuning
Hadoop configuration & performance tuningHadoop configuration & performance tuning
Hadoop configuration & performance tuning
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
 
Hadoop Backup and Disaster Recovery
Hadoop Backup and Disaster RecoveryHadoop Backup and Disaster Recovery
Hadoop Backup and Disaster Recovery
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 
Ganglia轻度使用指南
Ganglia轻度使用指南Ganglia轻度使用指南
Ganglia轻度使用指南
 
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
mpx Replay, Expedite Your Catch-Up and C3 Workflow 2 of 2
 
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
Enterprise workspaces - Extending SAP NetWeaver Portal capabilities
 
Energy Strategy Group_Report 2012 efficienza energetica
Energy Strategy Group_Report 2012 efficienza energeticaEnergy Strategy Group_Report 2012 efficienza energetica
Energy Strategy Group_Report 2012 efficienza energetica
 
Nt1310 project
Nt1310 projectNt1310 project
Nt1310 project
 
Credit cards
Credit cardsCredit cards
Credit cards
 
Basics of Coding in Pediatrics Medical Billing
Basics of Coding in Pediatrics Medical BillingBasics of Coding in Pediatrics Medical Billing
Basics of Coding in Pediatrics Medical Billing
 
Context Based Authentication
Context Based AuthenticationContext Based Authentication
Context Based Authentication
 
Secure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the WebSecure PIN Management How to Issue and Change PINs Securely over the Web
Secure PIN Management How to Issue and Change PINs Securely over the Web
 
cathy resume
cathy resumecathy resume
cathy resume
 

Similar to Hadoop Monitoring best Practices

Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
Nagios
 
Deployment with Fabric
Deployment with FabricDeployment with Fabric
Deployment with Fabric
andymccurdy
 
Dynamic Hadoop Clusters
Dynamic Hadoop ClustersDynamic Hadoop Clusters
Dynamic Hadoop Clusters
Steve Loughran
 
Introduction to PowerShell
Introduction to PowerShellIntroduction to PowerShell
Introduction to PowerShell
Boulos Dib
 
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...
Nagios
 
Django deployment with PaaS
Django deployment with PaaSDjango deployment with PaaS
Django deployment with PaaS
Appsembler
 
Hw09 Production Deep Dive With High Availability
Hw09   Production Deep Dive With High AvailabilityHw09   Production Deep Dive With High Availability
Hw09 Production Deep Dive With High Availability
Cloudera, Inc.
 
Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015
Hao Chen
 
Planning for-high-performance-web-application
Planning for-high-performance-web-applicationPlanning for-high-performance-web-application
Planning for-high-performance-web-application
Nguyễn Duy Nhân
 
Osd ctw spark
Osd ctw sparkOsd ctw spark
Osd ctw spark
Wisely chen
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalability
Wim Godden
 
Going live with BommandBox and docker Into The Box 2018
Going live with BommandBox and docker Into The Box 2018Going live with BommandBox and docker Into The Box 2018
Going live with BommandBox and docker Into The Box 2018
Ortus Solutions, Corp
 
Into The Box 2018 Going live with commandbox and docker
Into The Box 2018 Going live with commandbox and dockerInto The Box 2018 Going live with commandbox and docker
Into The Box 2018 Going live with commandbox and docker
Ortus Solutions, Corp
 
Basics of big data analytics hadoop
Basics of big data analytics hadoopBasics of big data analytics hadoop
Basics of big data analytics hadoop
Ambuj Kumar
 
FP - Découverte de Play Framework Scala
FP - Découverte de Play Framework ScalaFP - Découverte de Play Framework Scala
FP - Découverte de Play Framework Scala
Kévin Margueritte
 
Monitoring MySQL with DTrace/SystemTap
Monitoring MySQL with DTrace/SystemTapMonitoring MySQL with DTrace/SystemTap
Monitoring MySQL with DTrace/SystemTap
Padraig O'Sullivan
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performance
Ashok Modi
 
Deep learning - the conf br 2018
Deep learning - the conf br 2018Deep learning - the conf br 2018
Deep learning - the conf br 2018
Fabio Janiszevski
 
Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016
StackIQ
 
Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...
Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...
Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...
BradNeuberg
 

Similar to Hadoop Monitoring best Practices (20)

Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...Nagios Conference 2011 - Daniel Wittenberg -  Scaling Nagios At A Giant Insur...
Nagios Conference 2011 - Daniel Wittenberg - Scaling Nagios At A Giant Insur...
 
Deployment with Fabric
Deployment with FabricDeployment with Fabric
Deployment with Fabric
 
Dynamic Hadoop Clusters
Dynamic Hadoop ClustersDynamic Hadoop Clusters
Dynamic Hadoop Clusters
 
Introduction to PowerShell
Introduction to PowerShellIntroduction to PowerShell
Introduction to PowerShell
 
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...
Nagios Conference 2012 - Dan Wittenberg - Case Study: Scaling Nagios Core at ...
 
Django deployment with PaaS
Django deployment with PaaSDjango deployment with PaaS
Django deployment with PaaS
 
Hw09 Production Deep Dive With High Availability
Hw09   Production Deep Dive With High AvailabilityHw09   Production Deep Dive With High Availability
Hw09 Production Deep Dive With High Availability
 
Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015Eagle from eBay at China Hadoop Summit 2015
Eagle from eBay at China Hadoop Summit 2015
 
Planning for-high-performance-web-application
Planning for-high-performance-web-applicationPlanning for-high-performance-web-application
Planning for-high-performance-web-application
 
Osd ctw spark
Osd ctw sparkOsd ctw spark
Osd ctw spark
 
Caching and tuning fun for high scalability
Caching and tuning fun for high scalabilityCaching and tuning fun for high scalability
Caching and tuning fun for high scalability
 
Going live with BommandBox and docker Into The Box 2018
Going live with BommandBox and docker Into The Box 2018Going live with BommandBox and docker Into The Box 2018
Going live with BommandBox and docker Into The Box 2018
 
Into The Box 2018 Going live with commandbox and docker
Into The Box 2018 Going live with commandbox and dockerInto The Box 2018 Going live with commandbox and docker
Into The Box 2018 Going live with commandbox and docker
 
Basics of big data analytics hadoop
Basics of big data analytics hadoopBasics of big data analytics hadoop
Basics of big data analytics hadoop
 
FP - Découverte de Play Framework Scala
FP - Découverte de Play Framework ScalaFP - Découverte de Play Framework Scala
FP - Découverte de Play Framework Scala
 
Monitoring MySQL with DTrace/SystemTap
Monitoring MySQL with DTrace/SystemTapMonitoring MySQL with DTrace/SystemTap
Monitoring MySQL with DTrace/SystemTap
 
DrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performanceDrupalCampLA 2011: Drupal backend-performance
DrupalCampLA 2011: Drupal backend-performance
 
Deep learning - the conf br 2018
Deep learning - the conf br 2018Deep learning - the conf br 2018
Deep learning - the conf br 2018
 
Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016Introduction to Stacki at Atlanta Meetup February 2016
Introduction to Stacki at Atlanta Meetup February 2016
 
Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...
Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...
Beyond Cookies, Persistent Storage For Web Applications Web Directions North ...
 

More from Edward Capriolo

Nibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeNibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL store
Edward Capriolo
 
Web-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batchWeb-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batch
Edward Capriolo
 
Big data nyu
Big data nyuBig data nyu
Big data nyu
Edward Capriolo
 
Cassandra4hadoop
Cassandra4hadoopCassandra4hadoop
Cassandra4hadoop
Edward Capriolo
 
Intravert Server side processing for Cassandra
Intravert Server side processing for CassandraIntravert Server side processing for Cassandra
Intravert Server side processing for Cassandra
Edward Capriolo
 
M6d cassandra summit
M6d cassandra summitM6d cassandra summit
M6d cassandra summit
Edward Capriolo
 
Apache Kafka Demo
Apache Kafka DemoApache Kafka Demo
Apache Kafka Demo
Edward Capriolo
 
Cassandra NoSQL Lan party
Cassandra NoSQL Lan partyCassandra NoSQL Lan party
Cassandra NoSQL Lan party
Edward Capriolo
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
Edward Capriolo
 
Breaking first-normal form with Hive
Breaking first-normal form with HiveBreaking first-normal form with Hive
Breaking first-normal form with Hive
Edward Capriolo
 
Casbase presentation
Casbase presentationCasbase presentation
Casbase presentation
Edward Capriolo
 
Whirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIveWhirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIve
Edward Capriolo
 
Cli deep dive
Cli deep diveCli deep dive
Cli deep dive
Edward Capriolo
 
Cassandra as Memcache
Cassandra as MemcacheCassandra as Memcache
Cassandra as Memcache
Edward Capriolo
 
Counters for real-time statistics
Counters for real-time statisticsCounters for real-time statistics
Counters for real-time statistics
Edward Capriolo
 
Real world capacity
Real world capacityReal world capacity
Real world capacity
Edward Capriolo
 

More from Edward Capriolo (16)

Nibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL storeNibiru: Building your own NoSQL store
Nibiru: Building your own NoSQL store
 
Web-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batchWeb-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batch
 
Big data nyu
Big data nyuBig data nyu
Big data nyu
 
Cassandra4hadoop
Cassandra4hadoopCassandra4hadoop
Cassandra4hadoop
 
Intravert Server side processing for Cassandra
Intravert Server side processing for CassandraIntravert Server side processing for Cassandra
Intravert Server side processing for Cassandra
 
M6d cassandra summit
M6d cassandra summitM6d cassandra summit
M6d cassandra summit
 
Apache Kafka Demo
Apache Kafka DemoApache Kafka Demo
Apache Kafka Demo
 
Cassandra NoSQL Lan party
Cassandra NoSQL Lan partyCassandra NoSQL Lan party
Cassandra NoSQL Lan party
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
Breaking first-normal form with Hive
Breaking first-normal form with HiveBreaking first-normal form with Hive
Breaking first-normal form with Hive
 
Casbase presentation
Casbase presentationCasbase presentation
Casbase presentation
 
Whirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIveWhirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIve
 
Cli deep dive
Cli deep diveCli deep dive
Cli deep dive
 
Cassandra as Memcache
Cassandra as MemcacheCassandra as Memcache
Cassandra as Memcache
 
Counters for real-time statistics
Counters for real-time statisticsCounters for real-time statistics
Counters for real-time statistics
 
Real world capacity
Real world capacityReal world capacity
Real world capacity
 

Recently uploaded

Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
BookNet Canada
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
Aurora Consulting
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
Safe Software
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
Adam Dunkels
 
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Chris Swan
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
ishalveerrandhawa1
 
find out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challengesfind out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challenges
huseindihon
 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
Stephanie Beckett
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
Stephanie Beckett
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Erasmo Purificato
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Bert Blevins
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
Larry Smarr
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
ScyllaDB
 
Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
welrejdoall
 
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
Kief Morris
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
Larry Smarr
 
7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf
Enterprise Wired
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
SynapseIndia
 
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
HackersList
 

Recently uploaded (20)

Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024Details of description part II: Describing images in practice - Tech Forum 2024
Details of description part II: Describing images in practice - Tech Forum 2024
 
Quality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of TimeQuality Patents: Patents That Stand the Test of Time
Quality Patents: Patents That Stand the Test of Time
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
How to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptxHow to Build a Profitable IoT Product.pptx
How to Build a Profitable IoT Product.pptx
 
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
Fluttercon 2024: Showing that you care about security - OpenSSF Scorecards fo...
 
Calgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptxCalgary MuleSoft Meetup APM and IDP .pptx
Calgary MuleSoft Meetup APM and IDP .pptx
 
find out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challengesfind out more about the role of autonomous vehicles in facing global challenges
find out more about the role of autonomous vehicles in facing global challenges
 
What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024What’s New in Teams Calling, Meetings and Devices May 2024
What’s New in Teams Calling, Meetings and Devices May 2024
 
What's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptxWhat's New in Copilot for Microsoft365 May 2024.pptx
What's New in Copilot for Microsoft365 May 2024.pptx
 
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
Paradigm Shifts in User Modeling: A Journey from Historical Foundations to Em...
 
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly DetectionAdvanced Techniques for Cyber Security Analysis and Anomaly Detection
Advanced Techniques for Cyber Security Analysis and Anomaly Detection
 
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-InTrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
TrustArc Webinar - 2024 Data Privacy Trends: A Mid-Year Check-In
 
The Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive ComputingThe Rise of Supernetwork Data Intensive Computing
The Rise of Supernetwork Data Intensive Computing
 
Measuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at TwitterMeasuring the Impact of Network Latency at Twitter
Measuring the Impact of Network Latency at Twitter
 
Manual | Product | Research Presentation
Manual | Product | Research PresentationManual | Product | Research Presentation
Manual | Product | Research Presentation
 
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
[Talk] Moving Beyond Spaghetti Infrastructure [AOTB] 2024-07-04.pdf
 
The Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU CampusesThe Increasing Use of the National Research Platform by the CSU Campuses
The Increasing Use of the National Research Platform by the CSU Campuses
 
7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf7 Most Powerful Solar Storms in the History of Earth.pdf
7 Most Powerful Solar Storms in the History of Earth.pdf
 
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptxRPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
RPA In Healthcare Benefits, Use Case, Trend And Challenges 2024.pptx
 
How Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdfHow Social Media Hackers Help You to See Your Wife's Message.pdf
How Social Media Hackers Help You to See Your Wife's Message.pdf
 

Hadoop Monitoring best Practices

  • 1. How to monitor the $H!T out of Hadoop Developing a comprehensive open approach to monitoring hadoop clusters
  • 2. Relevant Hadoop Information From 3 – 3000 Nodes Hardware/Software failures “common” Redundant Components DataNode, TaskTracker Non-redundant Components NameNode, JobTracker, SecondaryNameNode Fast Evolving Technology (Best Practices?)
  • 3. Monitoring Software Nagios – Red Yellow Green Alerts, Escalations Defacto Standard – Widely deployed Text base configuration Web Interface Pluggable with shell scripts/external apps Return 0 - OK
  • 4. Cacti Performance Graphing System RRD/RRA Front End Slick Web Interface Template System for Graph Types Pluggable SNMP input Shell script /external program
  • 5.  
  • 6. hadoop-cacti-jtg JMX Fetching Code w/ (kick off) scripts Cacti templates For Hadoop Premade Nagios Check Scripts Helper/Batch/automation scripts Apache License
  • 8. Sample Cluster P1 NameNode & SecNameNode Hardware RAID 8 GB RAM 1x QUAD CORE DerbyDB (hive) on SecNameNode JobTracker 8GB RAM 1x QUAD CORE
  • 9. A Sample Cluster p2 Slave (hadoopdata1-XXXX) JBOD 8x 1TB SATA Disk RAM 16GB 2x Quad Core
  • 10. Prerequisites Nagios (install) DAG RPMs Cacti (install) Several RPMS Liberal network access to the cluster
  • 11. Alerts & Escalations X nodes * Y Services = < Sleep Define a policy Wake Me Up’s (SMS) Don’t Wake Me Up’s (EMAIL) Review (Daily, Weekly, Monthly)
  • 12. Wake Me Up’s NameNode Disk Full (Big Big Headache) RAID Array Issues (failed disk) JobTracker SecNameNode Do not realize it is not working too late
  • 13. Don’t Wake Me Up’s Or ‘Wake someone else up’ DataNode Warning Currently Failed Disk will down the Data Node (see Jira) TaskTracker Hardware Bad Disk (Start RMA) Slaves are expendable (up to a point)
  • 14. Monitoring Battle Plan Start With the Basics Ping, Disk Add Hadoop Specific Alarms check_data_node Add JMX Graphing NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
  • 15. The Basics Nagios Nagios (All Nodes) Host up (Ping check) Disk % Full SWAP > 85 % * Load based alarms are somewhat useless 389% CPU load is not necessarily a bad thing in Hadoopville
  • 16. The Basics Cacti Cacti (All Nodes) CPU (full CPU) RAM/SWAP Network Disk Usage
  • 18. RAID Tools Hpacucli – not a Street Fighter move Alerts on RAID events (NameNode) Disk failed Rebuilding JBOD (DataNode) Failed Drive Drive Errors Dell, SUN, Vendor Specific Tools
  • 19. Before you jump in X Nodes * Y Checks * = Lots of work About 3 Nodes into the process … Wait!!! I need some interns!!! Solution S.I.C.C.T. Semi-Intelligent-Configuration-cloning-tools (I made that up) (for this presentation)
  • 20. Nagios Answers “IS IT RUNNING?” Text based Configuration
  • 21. Cacti Answers “HOW WELL IS IT RUNNING?” Web Based configuration php-cli tools
  • 22. Monitoring Battle Plan Thus Far Start With the Basics Ping, Disk !!!!!!Done!!!!!! Add Hadoop Specific Alarms check_data_node Add JMX Graphing NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
  • 23. Add Hadoop Specific Alarms Hadoop Components with a Web Interface NameNode 50070 JobTracker 50030 TaskTracker 50060 DataNode 50075 check_http + regex = simple + effective
  • 24. nagios_check_commands.cfg Component Failure (Future) Newer Hadoop will have XML status define command { command_name check_remote_namenode command_line $USER1$/check_http -H $HOSTADDRESS$ -u http://$HOSTADDRESS$:$ARG1$/dfshealth.jsp -p $ARG1$ -r NameNode } define service {                service_description            check_remote_namenode                use                             generic-service                host_name                       hadoopname1                check_command               check_remote_namenode!50070 }
  • 25. Monitoring Battle Plan Start With the Basics Ping, Disk (Done) Add Hadoop Specific Alarms check_data_node (Done) Add JMX Graphing NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
  • 26. JMX Graphing Enable JMX Import Templates
  • 30.  
  • 32. Monitoring Battle Plan Thus Far Start With the Basics !!!!!!Done!!!!! Ping, Disk Add Hadoop Specific Alarms !Done! check_data_node Add JMX Graphing !Done! NameNodeOperations Add JMX Based alarms FilesTotal > 1,000,000 or LiveNodes < 50%
  • 33. Add JMX based Alarms hadoop-cacti-jtg is flexible extend fetch classes Don’t call output() Write your own check logic
  • 34. Quick JMX Base Walkthrough url, user, pass, object specified from CLI wantedVariables, wantedOperations by inheritance fetch() output() provided
  • 37. Monitoring Battle Plan Start With the Basics !DONE! Ping, Disk Add Hadoop Specific Alarms !DONE! check_data_node Add JMX Graphing !DONE! NameNodeOperations Add JMX Based alarms !DONE! FilesTotal > 1,000,000 or LiveNodes < 50%
  • 38. Review File System Growth Size Number of Files Number of Blocks Ratio’s Utilization CPU/Memory Disk Email (nightly) FSCK DSFADMIN
  • 39. The Future JMX Coming to JobTracker and TaskTracker (0.21) Collect and Graph Jobs Running Collect and Graph Map / Reduce per node Profile Specific Jobs in Cacti?