Performance Analysis:
new tools and concepts
from the cloud
Brendan Gregg
Lead Performance Engineer, Joyent
brendan.gregg@joyent.com

SCaLE10x
Jan, 2012
whoami
• I do performance analysis
• I also write performance tools out of necessity
• Was Brendan @ Sun Microsystems, Oracle,
now Joyent
Joyent
• Cloud computing provider
• Cloud computing software
• SmartOS
• host OS, and guest via OS virtualization
• Linux, Windows
• guest via KVM
Agenda
• Data
• Example problems & solutions
• How cloud environments complicate performance
• Theory
• Performance analysis
• Summarize new tools & concepts
• This talk uses SmartOS and DTrace to illustrate
concepts that are applicable to most OSes.

Data
• Example problems:
• CPU
• Memory
• Disk
• Network
• Some have neat solutions, some messy, some none
• This is real world
• Some I’ve covered before, some I haven’t
CPU
CPU utilization: problem
• Would like to identify:
• single or multiple CPUs at 100% utilization
• average, minimum and maximum CPU utilization
• CPU utilization balance (tight or loose distribution)
• time-based characteristics
changing/bursting? burst interval, burst length

• For small to large environments
• entire datacenters or clouds
CPU utilization
• mpstat(1) has the data. 1 second, 1 server (16 CPUs):
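For reference, a sketch of the raw data source (the original screenshot shows one line per CPU per interval; utilization is 100 - idl):

$ mpstat 1
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
[...]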

CPU utilization
• Scaling to 60 seconds, 1 server:
CPU utilization
• Scaling to entire datacenter, 60 secs, 5312 CPUs:
CPU utilization
• Line graphs can solve some problems:
• x-axis: time, 60 seconds
• y-axis: utilization
CPU utilization
• ... but don’t scale well to individual devices
• 5312 CPUs, each as a line:

CPU utilization
• Pretty, but scale limited as well:
CPU utilization
• Utilization as a heat map:
• x-axis: time, y-axis: utilization
• z-axis (color): number of CPUs
CPU utilization
• Available in Cloud Analytics (Joyent)
• Clicking highlights and shows details; eg, hostname:
CPU utilization
• Utilization heat map also suitable and used for:
• disks
• network interfaces
• Utilization as a metric can be a bit misleading
• really a percent busy over a time interval
• devices may accept more work at 100% busy
• may not directly relate to performance impact

CPU utilization: summary
• Data readily available
• Using a new visualization
CPU usage
• Given a CPU is hot, what is it doing?
• Beyond just vmstat’s usr/sys ratio
• Profiling (sampling at an interval) the program
counter or stack back trace

• user-land stack for %usr
• kernel stack for %sys
• Many tools can do this to some degree
• Developer Studios/DTrace/oprofile/...
CPU usage: profiling
• Frequency count on-CPU user-land stack traces:
# dtrace -x ustackframes=100 -n 'profile-997 /execname == "mysqld"/ {
@[ustack()] = count(); } tick-60s { exit(0); }'
dtrace: description 'profile-997 ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  1  75195                        :tick-60s
[...]
libc.so.1`__priocntlset+0xa
libc.so.1`getparam+0x83
libc.so.1`pthread_getschedparam+0x3c
libc.so.1`pthread_setschedprio+0x1f
mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x9ab
mysqld`_Z10do_commandP3THD+0x198
mysqld`handle_one_connection+0x1a6
libc.so.1`_thrp_setup+0x8d
libc.so.1`_lwp_start
4884
mysqld`_Z13add_to_statusP17system_status_varS0_+0x47
mysqld`_Z22calc_sum_of_all_statusP17system_status_var+0x67
mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x1222
mysqld`_Z10do_commandP3THD+0x198
mysqld`handle_one_connection+0x1a6
libc.so.1`_thrp_setup+0x8d
libc.so.1`_lwp_start
5530
CPU usage: profiling
• Frequency count on-CPU user-land stack traces:
# dtrace -x ustackframes=100 -n 'profile-997 /execname == "mysqld"/ {
@[ustack()] = count(); } tick-60s { exit(0); }'
dtrace: description 'profile-997 ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  1  75195                        :tick-60s
[...]
libc.so.1`__priocntlset+0xa
libc.so.1`getparam+0x83
libc.so.1`pthread_getschedparam+0x3c
libc.so.1`pthread_setschedprio+0x1f
mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x9ab
mysqld`_Z10do_commandP3THD+0x198
mysqld`handle_one_connection+0x1a6
libc.so.1`_thrp_setup+0x8d
libc.so.1`_lwp_start
4884

(over 500,000 lines truncated)

mysqld`_Z13add_to_statusP17system_status_varS0_+0x47
mysqld`_Z22calc_sum_of_all_statusP17system_status_var+0x67
mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x1222
mysqld`_Z10do_commandP3THD+0x198
mysqld`handle_one_connection+0x1a6
libc.so.1`_thrp_setup+0x8d
libc.so.1`_lwp_start
5530

CPU usage: profiling data
CPU usage: visualization
• Visualized as a “Flame Graph”:
CPU usage: Flame Graphs
• Just some Perl that turns DTrace output into an
interactive SVG: mouse-over elements for details
(example invocation below)

• It’s on github
• http://github.com/brendangregg/FlameGraph

• Works on kernel stacks, and both user+kernel
• Shouldn’t be hard to have it process oprofile, etc.
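For example, rendering the earlier mysqld profile with the scripts from that repository — a sketch; output file names are placeholders:

# dtrace -x ustackframes=100 -n 'profile-997 /execname == "mysqld"/ {
    @[ustack()] = count(); } tick-60s { exit(0); }' -o out.ustacks
$ ./stackcollapse.pl out.ustacks > out.folded
$ ./flamegraph.pl out.folded > mysqld.svg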
CPU usage: on the Cloud
• Flame Graphs were born out of necessity on
Cloud environments:

• Perf issues need quick resolution
(you just got hackernews’d)

• Everyone is running different versions of everything
(don’t assume you’ve seen the last of old CPU-hot
code-path issues that have been fixed)

CPU usage: summary
• Data can be available
• For cloud computing: easy for operators to fetch on
OS virtualized environments; otherwise agent driven,
and possibly other difficulties (access to CPU
instrumentation counter-based interrupts)

• Using a new visualization
CPU latency
• CPU dispatcher queue latency
• thread is ready-to-run, and waiting its turn
• Observable in coarse ways:
• vmstat’s r
• high load averages
• Less coarse, with microstate accounting
• prstat -mL’s LAT (example below)
• How much is it affecting application performance?
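For example:

$ prstat -mLc 1     # LAT column: % of time each thread waited on a dispatcher queue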
CPU latency: zonedispqlat.d
• Using DTrace to trace kernel scheduler events:
# ./zonedispqlat.d
Tracing...
Note: outliers (> 1 secs) may be artifacts due to the use of scalar globals
(sorry).

CPU disp queue latency by zone (ns):
  dbprod-045
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             10210
            2048 |@@@@@@@@@@                               3829
            4096 |@                                        514
            8192 |                                         94
           16384 |                                         0
           32768 |                                         0
           65536 |                                         0
          131072 |                                         0
          262144 |                                         0
          524288 |                                         0
         1048576 |                                         1
         2097152 |                                         0
         4194304 |                                         0
         8388608 |                                         1
        16777216 |                                         0
[...]
CPU latency: zonedispqlat.d
• CPU dispatcher queue latency by zonename
(zonedispqlat.d), work in progress:

#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("Tracing...\n");
	printf("Note: outliers (> 1 secs) may be artifacts due to the ");
	printf("use of scalar globals (sorry).\n\n");
}

sched:::enqueue
{
	/* scalar global (I don't think this can be thread local) */
	start[args[0]->pr_lwpid, args[1]->pr_pid] = timestamp;
}

sched:::dequeue
/this->start = start[args[0]->pr_lwpid, args[1]->pr_pid]/
{
	this->time = timestamp - this->start;
	/* workaround since zonename isn't a member of args[1]... */
	this->zone = ((proc_t *)args[1]->pr_addr)->p_zone->zone_name;
	@[stringof(this->zone)] = quantize(this->time);
	start[args[0]->pr_lwpid, args[1]->pr_pid] = 0;
}

tick-1sec
{
	printf("CPU disp queue latency by zone (ns):\n");
	printa(@);
	trunc(@);
}

Save timestamp on enqueue; calculate delta on dequeue.

CPU latency: zonedispqlat.d
• Instead of zonename, this could be process name, ...
• Tracing scheduler enqueue/dequeue events and
saving timestamps costs CPU overhead

• they are frequent
• I’d prefer to only trace dequeue, and reuse the
existing microstate accounting timestamps

• but one problem is a clash between unscaled and
scaled timestamps
CPU latency: on the Cloud
• With virtualization, you can have:

high CPU latency with idle CPUs
due to an instance consuming its quota

• OS virtualization
• not visible in vmstat r
• is visible as part of prstat -mL’s LAT
• more kstats recently added to SmartOS
including nsec_waitrq (total run queue wait by zone)

• Hardware virtualization
• vmstat st (stolen)
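The nsec_waitrq kstat mentioned above can be read from the global zone — a sketch, assuming the per-zone kstats are published under the zones module:

$ kstat -p zones:::nsec_waitrq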
CPU latency: caps
• CPU cap latency from the host (zonecapslat.d):
#!/usr/sbin/dtrace -s
#pragma D option quiet
sched:::cpucaps-sleep
{
start[args[0]->pr_lwpid, args[1]->pr_pid] = timestamp;
}
sched:::cpucaps-wakeup
/this->start = start[args[0]->pr_lwpid, args[1]->pr_pid]/
{
this->time = timestamp - this->start;
/* workaround since zonename isn't a member of args[1]... */
this->zone = ((proc_t *)args[1]->pr_addr)->p_zone->zone_name;
@[stringof(this->zone)] = quantize(this->time);
start[args[0]->pr_lwpid, args[1]->pr_pid] = 0;
}
tick-1sec
{
printf("CPU caps latency by zone (ns):n");
printa(@);
trunc(@);
}
CPU latency: summary
• Partial data available
• New tools/metrics created
• although current DTrace solutions have overhead;
we should be able to improve that

• although, new kstats may be sufficient

Memory
Memory: problem
• Riak database has endless memory growth.
• expected 9GB, after two days:
$ prstat -c 1
Please wait...
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 21722 103        43G   40G cpu0    59    0  72:23:41 2.6% beam.smp/594
 15770 root     7760K  540K sleep   57    0  23:28:57 0.9% zoneadmd/5
    95 root        0K    0K sleep   99  -20   7:37:47 0.2% zpool-zones/166
 12827 root      128M   73M sleep  100    -   0:49:36 0.1% node/5
 10319 bgregg     10M 6788K sleep   59    0   0:00:00 0.0% sshd/1
 10402 root       22M  288K sleep   59    0   0:18:45 0.0% dtrace/1
[...]

• Eventually hits paging and terrible performance
• needing a restart
• Is this a memory leak?

Or application growth?
Memory: scope
• Identify the subsystem and team responsible
Subsystem     Team
Application   Voxer
Riak          Basho
Erlang        Ericsson
SmartOS       Joyent
Memory: heap profiling
• What is in the heap?
$ pmap 14719
14719:  beam.smp
0000000000400000       2168K r-x--  /opt/riak/erts-5.8.5/bin/beam.smp
000000000062D000        328K rw---  /opt/riak/erts-5.8.5/bin/beam.smp
000000000067F000    4193540K rw---  /opt/riak/erts-5.8.5/bin/beam.smp
00000001005C0000    4194296K rw---    [ anon ]
00000002005BE000    4192016K rw---    [ anon ]
0000000300382000    4193664K rw---    [ anon ]
00000004002E2000    4191172K rw---    [ anon ]
00000004FFFD3000    4194040K rw---    [ anon ]
00000005FFF91000    4194028K rw---    [ anon ]
00000006FFF4C000    4188812K rw---    [ anon ]
00000007FF9EF000     588224K rw---    [ heap ]
[...]
• ... and why does it keep growing?
• Would like to answer these in production
• Without restarting apps. Experimentation (backend=mmap,
other allocators) wasn’t working.

Memory: heap profiling
• libumem was used for multi-threaded performance
• libumem == user-land slab allocator
• detailed observability can be enabled, allowing heap
profiling and leak detection

• While designed with speed and production use in
mind, it still comes with some cost (time and space),
and isn’t on by default

• UMEM_DEBUG=audit
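A sketch of enabling auditing for a target process and then checking for leaks with mdb (the launch command here is hypothetical):

$ UMEM_DEBUG=audit LD_PRELOAD=libumem.so.1 riak start
$ gcore 14719            # capture a core of the running beam.smp
$ mdb core.14719
> ::findleaks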
Memory: heap profiling
• libumem provides some default observability
• Eg, slabs:
> ::umem_malloc_info
CACHE
BUFSZ MAXMAL BUFMALLC
0000000000707028
8
0
0
000000000070b028
16
8
8730
000000000070c028
32
16
8772
000000000070f028
48
32 1148038
0000000000710028
64
48
344138
0000000000711028
80
64
36
0000000000714028
96
80
8934
0000000000715028
112
96 1347040
0000000000718028
128
112
253107
000000000071a028
160
144
40529
000000000071b028
192
176
140
000000000071e028
224
208
43
000000000071f028
256
240
133
0000000000720028
320
304
56
0000000000723028
384
368
35
[...]

AVG_MAL
0
8
16
25
40
62
79
87
111
118
155
188
229
276
335

MALLOCED
0
69836
140352
29127788
13765658
2226
705348
117120208
28011923
4788681
21712
8101
30447
15455
11726

OVERHEAD
0
1054998
1130491
156179051
58417287
4806
1168558
190389780
42279506
6466801
25818
6497
26211
12276
7220

%OVER
0.0%
1510.6%
805.4%
536.1%
424.3%
215.9%
165.6%
162.5%
150.9%
135.0%
118.9%
80.1%
86.0%
79.4%
61.5%
Memory: heap profiling
• ... and heap (captured @14GB RSS):
> ::vmem
ADDR
NAME
fffffd7ffebed4a0 sbrk_top
fffffd7ffebee0a8
sbrk_heap
fffffd7ffebeecb0
vmem_internal
fffffd7ffebef8b8
vmem_seg
fffffd7ffebf04c0
vmem_hash
fffffd7ffebf10c8
vmem_vmem
00000000006e7000
umem_internal
00000000006e8000
umem_cache
00000000006e9000
umem_hash
00000000006ea000
umem_log
00000000006eb000
umem_firewall_va
00000000006ec000
umem_firewall
00000000006ed000
umem_oversize
00000000006f0000
umem_memalign
0000000000706000
umem_default

INUSE
9090404352
9090404352
664616960
651993088
12583424
46200
352862464
113696
13091328
0
0
0
5218777974
0
2552131584

TOTAL
14240165888
9090404352
664616960
651993088
12587008
55344
352866304
180224
13099008
0
0
0
5520789504
0
2552131584

SUCCEED FAIL
4298117 84403
4298117
0
79621
0
79589
0
27
0
15
0
88746
0
44
0
86
0
0
0
0
0
0
0
3822051
0
0
0
307699
0

• The heap is 9 GB (as expected), but sbrk_top total is
14 GB (equal to RSS). And growing.

• Are there Gbyte-sized malloc()/free()s?
Memory: malloc() profiling
# dtrace -n 'pid$target::malloc:entry { @ = quantize(arg0); }' -p 17472
dtrace: description 'pid$target::malloc:entry ' matched 3 probes
^C
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |                                         3
               8 |@                                        5927
              16 |@@@@                                     41818
              32 |@@@@@@@@@                                81991
              64 |@@@@@@@@@@@@@@@@@@                       169888
             128 |@@@@@@@                                  69891
             256 |                                         2257
             512 |                                         406
            1024 |                                         893
            2048 |                                         146
            4096 |                                         1467
            8192 |                                         755
           16384 |                                         950
           32768 |                                         83
           65536 |                                         31
          131072 |                                         11
          262144 |                                         15
          524288 |                                         0
         1048576 |                                         1
         2097152 |                                         0
• No huge malloc()s, but RSS continues to climb.

Memory: malloc() profiling
# dtrace -n 'pid$target::malloc:entry { @ = quantize(arg0); }' -p 17472
dtrace: description 'pid$target::malloc:entry ' matched 3 probes
^C
           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |                                         3
               8 |@                                        5927
              16 |@@@@                                     41818
              32 |@@@@@@@@@                                81991
              64 |@@@@@@@@@@@@@@@@@@                       169888
             128 |@@@@@@@                                  69891
             256 |                                         2257
             512 |                                         406
            1024 |                                         893
            2048 |                                         146
            4096 |                                         1467
            8192 |                                         755
           16384 |                                         950
           32768 |                                         83
           65536 |                                         31
          131072 |                                         11
          262144 |                                         15
          524288 |                                         0
         1048576 |                                         1
         2097152 |                                         0

This tool (one-liner) profiles malloc() request sizes.

• No huge malloc()s, but RSS continues to climb.
Memory: heap growth
• Tracing why the heap grows via brk():
# dtrace -n 'syscall::brk:entry /execname == "beam.smp"/ { ustack(); }'
dtrace: description 'syscall::brk:entry ' matched 1 probe
CPU
ID
FUNCTION:NAME
10
18
brk:entry
libc.so.1`_brk_unlocked+0xa
libumem.so.1`vmem_sbrk_alloc+0x84
libumem.so.1`vmem_xalloc+0x669
libumem.so.1`vmem_alloc+0x14f
libumem.so.1`vmem_xalloc+0x669
libumem.so.1`vmem_alloc+0x14f
libumem.so.1`umem_alloc+0x72
libumem.so.1`malloc+0x59
libstdc++.so.6.0.14`_Znwm+0x20
libstdc++.so.6.0.14`_Znam+0x9
eleveldb.so`_ZN7leveldb9ReadBlockEPNS_16RandomAccessFileERKNS_11Rea...
eleveldb.so`_ZN7leveldb5Table11BlockReaderEPvRKNS_11ReadOptionsERKN...
eleveldb.so`_ZN7leveldb12_GLOBAL__N_116TwoLevelIterator13InitDataBl...
eleveldb.so`_ZN7leveldb12_GLOBAL__N_116TwoLevelIterator4SeekERKNS_5...
eleveldb.so`_ZN7leveldb12_GLOBAL__N_116TwoLevelIterator4SeekERKNS_5...
eleveldb.so`_ZN7leveldb12_GLOBAL__N_115MergingIterator4SeekERKNS_5S...
eleveldb.so`_ZN7leveldb12_GLOBAL__N_16DBIter4SeekERKNS_5SliceE+0xcc
eleveldb.so`eleveldb_get+0xd3
beam.smp`process_main+0x6939
beam.smp`sched_thread_func+0x1cf
beam.smp`thr_wrapper+0xbe

This shows the user-land stack trace for every heap growth.
Memory: heap growth
• More DTrace showed the size of the malloc()s
causing the brk()s:

# dtrace -x dynvarsize=4m -n '
pid$target::malloc:entry { self->size = arg0; }
syscall::brk:entry /self->size/ { printf("%d bytes", self->size); }
pid$target::malloc:return { self->size = 0; }' -p 17472
dtrace: description 'pid$target::malloc:entry ' matched 7 probes
CPU     ID                    FUNCTION:NAME
  0     44                      brk:entry 8343520 bytes
  0     44                      brk:entry 8343520 bytes
[...]

• These 8 Mbyte malloc()s grew the heap
• Even though the heap has Gbytes not in use
• This is starting to look like an OS issue
Memory: allocator internals
• More tools were created:
• Show memory entropy (+ malloc - free)
along with heap growth, over time (sketch below)

• Show codepath taken for allocations:
compare successful with unsuccessful (heap growth)

• Show allocator internals: sizes, options, flags
• And run in the production environment
• Briefly; tracing frequent allocs does cost overhead
• Casting light into what was a black box
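As a sketch of the first of these — net heap churn (malloc'd minus freed bytes) per second for one process; the pointer-to-size map makes this costly on hot allocators:

#!/usr/sbin/dtrace -s
/* sketch: per-second malloc'd vs freed bytes for one process */
pid$target::malloc:entry  { self->size = arg0; }
pid$target::malloc:return /self->size/
{
	allocd[arg1] = self->size;	/* remember size by returned pointer */
	@alloc = sum(self->size);
	self->size = 0;
}
pid$target::free:entry /allocd[arg0]/
{
	@freed = sum(allocd[arg0]);
	allocd[arg0] = 0;
}
tick-1sec
{
	printa("malloc'd: %@d bytes, freed: %@d bytes\n", @alloc, @freed);
	trunc(@alloc); trunc(@freed);
}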

Memory: allocator internals
  4    <- vmem_xalloc                           0
  4    -> _sbrk_grow_aligned                    4096
  4    <- _sbrk_grow_aligned                    17155911680
  4    -> vmem_xalloc                           7356400
  4      | vmem_xalloc:entry                    umem_oversize
  4      -> vmem_alloc                          7356416
  4        -> vmem_xalloc                       7356416
  4          | vmem_xalloc:entry                sbrk_heap
  4          -> vmem_sbrk_alloc                 7356416
  4            -> vmem_alloc                    7356416
  4              -> vmem_xalloc                 7356416
  4                | vmem_xalloc:entry          sbrk_top
  4                -> vmem_reap                 16777216
  4                <- vmem_reap                 3178535181209758
  4                | vmem_xalloc:return         vmem_xalloc() == NULL, vm:
sbrk_top, size: 7356416, align: 4096, phase: 0, nocross: 0, min: 0, max: 0,
vmflag: 1
              libumem.so.1`vmem_xalloc+0x80f
              libumem.so.1`vmem_sbrk_alloc+0x33
              libumem.so.1`vmem_xalloc+0x669
              libumem.so.1`vmem_alloc+0x14f
              libumem.so.1`vmem_xalloc+0x669
              libumem.so.1`vmem_alloc+0x14f
              libumem.so.1`umem_alloc+0x72
              libumem.so.1`malloc+0x59
              libstdc++.so.6.0.3`_Znwm+0x2b
              libstdc++.so.6.0.3`_ZNSs4_Rep9_S_createEmmRKSaIcE+0x7e
Memory: solution
• These new tools and metrics pointed to the
allocation algorithm “instant fit”

• Someone had suggested this earlier; the tools provided solid evidence that this
really was the case here

• A new version of libumem was built to force use of
VM_BESTFIT

• and added by Robert Mustacchi as a tunable:
UMEM_OPTIONS=allocator=best

• Customer restarted Riak with new libumem version
• Problem solved
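With the tunable in place, the workaround reduces to setting an environment variable before starting the application (a sketch; the start command is hypothetical):

$ UMEM_OPTIONS=allocator=best riak start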
Memory: on the Cloud
• With OS virtualization, you can have:
Paging without scanning

• paging == swapping blocks with physical storage
• swapping == swapping entire threads between main
memory and physical storage

• Resource control paging is unrelated to the page
scanner, so, no vmstat scan rate (sr) despite
anonymous paging

• More new tools: DTrace sysinfo:::anonpgin by
process name, zonename
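A sketch of such a tool as a one-liner:

# dtrace -n 'sysinfo:::anonpgin { @[zonename, execname] = count(); }'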
Memory: summary
• Superficial data available, detailed info not
• not by default
• Many new tools were created
• not easy, but made possible with DTrace

Disk
Disk: problem
• Application performance issues
• Disks look busy (iostat)
• Blame the disks?
$ iostat -xnz 1
[...]
                    extended device statistics
    r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
  124.0  334.9 15677.2 40484.9  0.0  1.0    0.0    2.2   1  69 c0t1d0
                    extended device statistics
    r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
  114.0  407.1 14595.9 49409.1  0.0  0.8    0.0    1.5   1  56 c0t1d0
                    extended device statistics
    r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
   85.0  438.0 10814.8 53242.1  0.0  0.8    0.0    1.6   1  57 c0t1d0

• Many graphical tools are built upon iostat
Disk: on the Cloud
• Tenants can’t see each other
• Maybe a neighbor is doing a backup?
• Maybe a neighbor is running a benchmark?
• Can’t see their processes (top/prstat)
• Blame what you can’t see
Disk: VFS
• Applications usually talk to a file system
• and are hurt by file system latency
• Disk I/O can be:
• unrelated to the application: asynchronous tasks
• inflated from what the application requested
• deflated


• blind to issues caused higher up the kernel stack

Disk: issues with iostat(1)
• Unrelated:
• other applications / tenants
• file system prefetch
• file system dirty data flushing
• Inflated:
• rounded up to the next file system record size
• extra metadata for on-disk format
• read-modify-write of RAID5
Disk: issues with iostat(1)
• Deflated:
• read caching
• write buffering
• Blind:
• lock contention in the file system
• CPU usage by the file system
• file system software bugs
• file system queue latency
Disk: issues with iostat(1)
• blind (continued):
• disk cache flush latency (if your file system does it)
• file system I/O throttling latency
• I/O throttling is a new ZFS feature for cloud
environments

• adds artificial latency to file system I/O to throttle it
• added by Bill Pijewski and Jerry Jelenik of Joyent
Disk: file system latency
• Using DTrace to summarize ZFS read latency:
$ dtrace -n 'fbt::zfs_read:entry { self->start = timestamp; }
fbt::zfs_read:return /self->start/ {
@["ns"] = quantize(timestamp - self->start); self->start = 0; }'
dtrace: description 'fbt::zfs_read:entry ' matched 2 probes
^C
ns
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@                                        6
            2048 |@@                                       18
            4096 |@@@@@@@                                  79
            8192 |@@@@@@@@@@@@@@@@@                        191
           16384 |@@@@@@@@@@                               112
           32768 |@                                        14
           65536 |                                         1
          131072 |                                         1
          262144 |                                         0
          524288 |                                         0
         1048576 |                                         0
         2097152 |                                         0
         4194304 |@@@                                      31
         8388608 |@                                        9
        16777216 |                                         0

Disk: file system latency
• Using DTrace to summarize ZFS read latency:
$ dtrace -n 'fbt::zfs_read:entry { self->start = timestamp; }
fbt::zfs_read:return /self->start/ {
@["ns"] = quantize(timestamp - self->start); self->start = 0; }'
dtrace: description 'fbt::zfs_read:entry ' matched 2 probes
^C
ns
           value  ------------- Distribution ------------- count
             512 |                                         0
            1024 |@                                        6
            2048 |@@                                       18
            4096 |@@@@@@@                                  79        <- Cache reads
            8192 |@@@@@@@@@@@@@@@@@                        191
           16384 |@@@@@@@@@@                               112
           32768 |@                                        14
           65536 |                                         1
          131072 |                                         1
          262144 |                                         0
          524288 |                                         0
         1048576 |                                         0
         2097152 |                                         0
         4194304 |@@@                                      31        <- Disk reads
         8388608 |@                                        9
        16777216 |                                         0
Disk: file system latency
• Tracing zfs events using zfsslower.d:
# ./zfsslower.d 10
TIME                  PROCESS  D  KB  ms  FILE
2011 May 17 01:23:12  mysqld   R  16  19  /z01/opt/mysql5-64/data/xxxxx/xxxxx.ibd
2011 May 17 01:23:13  mysqld   W  16  10  /z01/var/mysql/xxxxx/xxxxx.ibd
2011 May 17 01:23:33  mysqld   W  16  11  /z01/var/mysql/xxxxx/xxxxx.ibd
2011 May 17 01:23:33  mysqld   W  16  10  /z01/var/mysql/xxxxx/xxxxx.ibd
2011 May 17 01:23:51  httpd    R  56  14  /z01/home/xxxxx/xxxxx/xxxxx/xxxxx/xxxxx
^C

• Argument is the minimum latency in milliseconds
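• e.g., a lower threshold casts a wider net (hypothetical invocation):
# ./zfsslower.d 1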
Disk: file system latency
• Can trace this from other locations too:
• VFS layer: filter on desired file system types
• syscall layer: filter on file descriptors for file systems (a sketch follows)
• application layer: trace file I/O calls
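• For illustration, a rough syscall-layer sketch (assumption: no file
system filtering shown; it quantizes read(2) latency by process name):
$ dtrace -n 'syscall::read:entry { self->ts = timestamp; }
    syscall::read:return /self->ts/ {
    @[execname] = quantize(timestamp - self->ts); self->ts = 0; }'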
Disk: file system latency
• And using SystemTap:
# ./vfsrlat.stp
Tracing... Hit Ctrl-C to end
^C
[..]
ext4 (ns):
 value |-------------------------------------------------- count
   256 |                                                       0
   512 |                                                       0
  1024 |                                                      16
  2048 |                                                      17
  4096 |                                                       4
  8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 16321
 16384 |                                                      50
 32768 |                                                       1
 65536 |                                                      13
131072 |                                                       0
262144 |                                                       0
• Traces vfs.read to vfs.read.return, and gets the FS
type via: $file->f_path->dentry->d_inode->i_sb->s_type->name

• Warning: this script has crashed Ubuntu/CentOS; I’m told RHEL is better

Disk: file system visualizations
• File system latency as a heat map (Cloud Analytics):

• This screenshot shows severe outliers
Disk: file system visualizations
• Sometimes the heat map is very surprising:

• This screenshot is from the Oracle ZFS Storage Appliance
Disk: summary
• Misleading data available
• New tools/metrics created
• Latency visualizations
Network

Network: problem
• TCP SYNs queue in-kernel until they are accept()ed
• The queue length is the TCP listen backlog
• may be set in listen()
• and limited by a system tunable (usually 128)
• on SmartOS: tcp_conn_req_max_q

• What if the queue remains full
• e.g., the application is overwhelmed with other work,
• or CPU starved
• ... and another SYN arrives?
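• As a sketch (assuming the illumos ndd(1M) interface for TCP tunables),
the system-wide cap can be inspected and raised:
# ndd /dev/tcp tcp_conn_req_max_q            # read the current cap
# ndd -set /dev/tcp tcp_conn_req_max_q 1024  # raise it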
Network: TCP listen drops
• Packet is dropped by the kernel
• fortunately a counter is bumped:
$ netstat -s | grep Drop
tcpTimRetransDrop   =      56   tcpTimKeepalive     =   2582
tcpTimKeepaliveProbe=    1594   tcpTimKeepaliveDrop =     41
tcpListenDrop       = 3089298   tcpListenDropQ0     =      0
tcpHalfOpenDrop     =       0   tcpOutSackRetrans   =1400832
icmpOutDrops        =       0   icmpOutErrors       =      0
sctpTimRetrans      =       0   sctpTimRetransDrop  =      0
sctpTimHearBeatProbe=       0   sctpTimHearBeatDrop =      0
sctpListenDrop      =       0   sctpInClosed        =      0
• Remote host waits, and then retransmits
• TCP retransmit interval; usually 1 or 3 seconds
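• A quick shell sketch to watch the drop rate, polling once per second:
$ while :; do netstat -s | grep tcpListenDrop; sleep 1; done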
Network: predicting drops
• How do we know if we are close to dropping?
• An early warning
Network: tcpconnreqmaxq.d
• DTrace script traces drop events, if they occur:
# ./tcpconnreqmaxq.d
Tracing... Hit Ctrl-C to end.
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
[...]

• ... and when Ctrl-C is hit, summary distributions are printed (next slide):
Network: tcpconnreqmaxq.d
tcp_conn_req_cnt_q distributions:

cpid:3063                                          max_q:8
         value  ------------- Distribution ------------- count
            -1 |                                         0
             0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
             1 |                                         0

cpid:11504                                         max_q:128
         value  ------------- Distribution ------------- count
            -1 |                                         0
             0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     7279
             1 |@@                                      405
             2 |@                                       255
             4 |@                                       138
             8 |                                        81
            16 |                                        83
            32 |                                        62
            64 |                                        67
           128 |                                        34
           256 |                                        0

tcpListenDrops:
cpid:11504                                         max_q:128           34
Network: tcpconnreqmaxq.d
• Reading the output above: value is the length of the listen queue,
measured on each SYN event; max_q is the tcp_conn_req_max_q value in use
• For cpid:11504, the queue repeatedly reached its max_q of 128,
resulting in the 34 tcpListenDrops
Network: tcplistendrop.d
• More details can be fetched as needed:
# ./tcplistendrop.d
TIME                  SRC-IP          PORT        DST-IP           PORT
2012 Jan 19 01:22:49  10.17.210.103   25691  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.108   18423  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.116   38883  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.117   10739  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.112   27988  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.106   28824  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.12.143.16    65070  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.100   56392  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.99    24628  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.98    11686  ->   192.192.240.212  80
2012 Jan 19 01:22:49  10.17.210.101   34629  ->   192.192.240.212  80
[...]
• Just tracing the drop code-path
• Don’t need to pay the overhead of sniffing all packets
Network: DTrace code
• Key code from tcplistendrop.d:
fbt::tcp_input_listener:entry { self->mp = args[1]; }
fbt::tcp_input_listener:return { self->mp = 0; }
mib:::tcpListenDrop
/self->mp/
{
this->iph = (ipha_t *)self->mp->b_rptr;
this->tcph = (tcph_t *)(self->mp->b_rptr + 20);
printf("%-20Y %-18s %-5d -> %-18s %-5d\n", walltimestamp,
inet_ntoa(&this->iph->ipha_src),
ntohs(*(uint16_t *)this->tcph->th_lport),
inet_ntoa(&this->iph->ipha_dst),
ntohs(*(uint16_t *)this->tcph->th_fport));
}

• This uses the unstable-interface fbt provider
• a stable tcp provider now exists, which is better for
more common tasks, like connections by IP
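• e.g., a sketch with the stable provider (assuming the illumos tcp
probes and their ipinfo_t argument), counting refused connections by
remote address:
# dtrace -n 'tcp:::accept-refused { @[args[2]->ip_saddr] = count(); }'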

Network: summary
• For TCP, while many counters are available, they are
system-wide integers

• Custom tools can show more details
• addresses and ports
• kernel state
• needs kernel access and dynamic tracing
Data Recap
• Problem types
• CPU utilization   scalability
• CPU usage         scalability
• CPU latency       observability
• Memory            observability
• Disk              observability
• Network           observability
Data Recap
• Problem types, solution types
• The scalability and observability solutions above took the form of
new visualizations and new metrics
Theory

Performance Analysis
• Goals
• Capacity planning
• Issue analysis
Performance Issues
• Strategy
• Step 1: is there a problem?
• Step 2: which subsystem/team is responsible?
• Difficult to get past these steps without reliable
metrics
Problem Space
• Myths
• Vendors provide good metrics with good coverage
• The problem is to line-graph them
• Realities
• Metrics can be wrong, incomplete and misleading,
requiring time and expertise to interpret

• Line graphs can hide issues
Problem Space
• Cloud computing confuses matters further:
• hiding metrics from neighbors
• throttling performance due to invisible neighbors

Example Problems
• Included:
• Understanding utilization across 5,312 CPUs
• Using disk I/O metrics to explain application
performance

• A lack of metrics for memory growth, packet drops, ...
Example Solutions: tools
• Device utilization heat maps for CPUs
• Flame graphs for CPU profiling
• CPU dispatcher queue latency by zone
• CPU caps latency by zone
• malloc() size profiling
• Heap growth stack backtraces
• File system latency distributions
• File system latency tracing
• TCP accept queue length distribution
• TCP listen drop tracing with details
Key Concepts
• Visualizations
• heat maps for device utilization and latency
• flame graphs
• Custom metrics often necessary
• Latency-based for issue analysis
• If coding isn’t practical/timely, use dynamic tracing
• Cloud Computing
• Provide observability (often to show what the problem isn’t)
• Develop new metrics for resource control effects
DTrace
• Many problems were only solved thanks to DTrace
• In the SmartOS cloud environment:
• The compute node (global zone) can DTrace
everything (except for KVM guests, for which it has a
limited view: resource I/O + some MMU events, so far)

• SmartMachines (zones) have the DTrace syscall,
profile (their user-land only), pid and USDT providers

• Joyent Cloud Analytics uses DTrace from the global
zone to give extended details to customers
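• e.g., a zone-safe sketch using the syscall provider, counting system
calls by process name:
$ dtrace -n 'syscall:::entry { @[execname] = count(); }'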

Performance
• The more you know, the more you don’t
• Hopefully I’ve turned some unknown-unknowns
into known-unknowns
Thank you
• Resources:
• http://dtrace.org/blogs/brendan
• More CPU utilization visualizations:
  http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/
• Flame Graphs: http://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/
  and http://github.com/brendangregg/FlameGraph
• More iostat(1) & file system latency discussion:
  http://dtrace.org/blogs/brendan/tag/filesystem-2/
• Cloud Analytics:
• OSCON slides: http://dtrace.org/blogs/dap/files/2011/07/ca-oscon-data.pdf
• Joyent: http://joyent.com
• brendan@joyent.com

More Related Content

What's hot

Lisa12 methodologies
Lisa12 methodologiesLisa12 methodologies
Lisa12 methodologies
Brendan Gregg
 
USENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame Graphs
Brendan Gregg
 
JavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame GraphsJavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame Graphs
Brendan Gregg
 
ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016
Brendan Gregg
 
From DTrace to Linux
From DTrace to LinuxFrom DTrace to Linux
From DTrace to Linux
Brendan Gregg
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
Brendan Gregg
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
Brendan Gregg
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
Brendan Gregg
 
FreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsFreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame Graphs
Brendan Gregg
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
Brendan Gregg
 
Systems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the CloudSystems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the Cloud
Brendan Gregg
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slides
Ed Hunter
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Amazon Web Services
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
Brendan Gregg
 
DTrace Topics: Introduction
DTrace Topics: IntroductionDTrace Topics: Introduction
DTrace Topics: Introduction
Brendan Gregg
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
Brendan Gregg
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance Analysis
Brendan Gregg
 
EuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis MethodologiesEuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis Methodologies
Brendan Gregg
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
Brendan Gregg
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)
Brendan Gregg
 

What's hot (20)

Lisa12 methodologies
Lisa12 methodologiesLisa12 methodologies
Lisa12 methodologies
 
USENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame Graphs
 
JavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame GraphsJavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame Graphs
 
ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016
 
From DTrace to Linux
From DTrace to LinuxFrom DTrace to Linux
From DTrace to Linux
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
SREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
 
FreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame GraphsFreeBSD 2014 Flame Graphs
FreeBSD 2014 Flame Graphs
 
Velocity 2015 linux perf tools
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
 
Systems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the CloudSystems Performance: Enterprise and the Cloud
Systems Performance: Enterprise and the Cloud
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slides
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
 
YOW2021 Computing Performance
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
 
DTrace Topics: Introduction
DTrace Topics: IntroductionDTrace Topics: Introduction
DTrace Topics: Introduction
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance Analysis
 
EuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis MethodologiesEuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis Methodologies
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)
 

Viewers also liked

Performance Analysis: The USE Method
Performance Analysis: The USE MethodPerformance Analysis: The USE Method
Performance Analysis: The USE Method
Brendan Gregg
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
Brendan Gregg
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
Brendan Gregg
 
Real-time in the real world: DIRT in production
Real-time in the real world: DIRT in productionReal-time in the real world: DIRT in production
Real-time in the real world: DIRT in production
bcantrill
 
HTTP Application Performance Analysis
HTTP Application Performance AnalysisHTTP Application Performance Analysis
HTTP Application Performance Analysis
PerformanceVision (previously SecurActive)
 
Performance analysis 2013
Performance analysis 2013Performance analysis 2013
Performance analysis 2013
Kerry Harrison
 
Web performance optimization (WPO)
Web performance optimization (WPO)Web performance optimization (WPO)
Web performance optimization (WPO)
Mariusz Kaczmarek
 
Approaches to Software Testing
Approaches to Software TestingApproaches to Software Testing
Approaches to Software Testing
Scott Barber
 
DTraceCloud2012
DTraceCloud2012DTraceCloud2012
DTraceCloud2012
Brendan Gregg
 
The New Systems Performance
The New Systems PerformanceThe New Systems Performance
The New Systems Performance
Brendan Gregg
 
Linux Performance Tools 2014
Linux Performance Tools 2014Linux Performance Tools 2014
Linux Performance Tools 2014
Brendan Gregg
 
Measuring the Performance of Single Page Applications
Measuring the Performance of Single Page ApplicationsMeasuring the Performance of Single Page Applications
Measuring the Performance of Single Page Applications
Nicholas Jansma
 
Application Performance Management - Solving the Performance Puzzle
Application Performance Management - Solving the Performance PuzzleApplication Performance Management - Solving the Performance Puzzle
Application Performance Management - Solving the Performance Puzzle
LDragich
 
An Introduction to Software Performance Engineering
An Introduction to Software Performance EngineeringAn Introduction to Software Performance Engineering
An Introduction to Software Performance Engineering
Correlsense
 
Application Performance Monitoring (APM)
Application Performance Monitoring (APM)Application Performance Monitoring (APM)
Application Performance Monitoring (APM)
Site24x7
 
Effective Use Of Performance Analysis
Effective Use Of Performance AnalysisEffective Use Of Performance Analysis
Effective Use Of Performance Analysis
Keith Lyons
 
A Modern Approach to Performance Monitoring
A Modern Approach to Performance MonitoringA Modern Approach to Performance Monitoring
A Modern Approach to Performance Monitoring
Cliff Crocker
 
Using dynaTrace to optimise application performance
Using dynaTrace to optimise application performanceUsing dynaTrace to optimise application performance
Using dynaTrace to optimise application performance
Richard Bishop
 
Dynatrace
DynatraceDynatrace
Dynatrace
Purnima Kurella
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
Brendan Gregg
 

Viewers also liked (20)

Performance Analysis: The USE Method
Performance Analysis: The USE MethodPerformance Analysis: The USE Method
Performance Analysis: The USE Method
 
Linux Performance Analysis and Tools
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 
Real-time in the real world: DIRT in production
Real-time in the real world: DIRT in productionReal-time in the real world: DIRT in production
Real-time in the real world: DIRT in production
 
HTTP Application Performance Analysis
HTTP Application Performance AnalysisHTTP Application Performance Analysis
HTTP Application Performance Analysis
 
Performance analysis 2013
Performance analysis 2013Performance analysis 2013
Performance analysis 2013
 
Web performance optimization (WPO)
Web performance optimization (WPO)Web performance optimization (WPO)
Web performance optimization (WPO)
 
Approaches to Software Testing
Approaches to Software TestingApproaches to Software Testing
Approaches to Software Testing
 
DTraceCloud2012
DTraceCloud2012DTraceCloud2012
DTraceCloud2012
 
The New Systems Performance
The New Systems PerformanceThe New Systems Performance
The New Systems Performance
 
Linux Performance Tools 2014
Linux Performance Tools 2014Linux Performance Tools 2014
Linux Performance Tools 2014
 
Measuring the Performance of Single Page Applications
Measuring the Performance of Single Page ApplicationsMeasuring the Performance of Single Page Applications
Measuring the Performance of Single Page Applications
 
Application Performance Management - Solving the Performance Puzzle
Application Performance Management - Solving the Performance PuzzleApplication Performance Management - Solving the Performance Puzzle
Application Performance Management - Solving the Performance Puzzle
 
An Introduction to Software Performance Engineering
An Introduction to Software Performance EngineeringAn Introduction to Software Performance Engineering
An Introduction to Software Performance Engineering
 
Application Performance Monitoring (APM)
Application Performance Monitoring (APM)Application Performance Monitoring (APM)
Application Performance Monitoring (APM)
 
Effective Use Of Performance Analysis
Effective Use Of Performance AnalysisEffective Use Of Performance Analysis
Effective Use Of Performance Analysis
 
A Modern Approach to Performance Monitoring
A Modern Approach to Performance MonitoringA Modern Approach to Performance Monitoring
A Modern Approach to Performance Monitoring
 
Using dynaTrace to optimise application performance
Using dynaTrace to optimise application performanceUsing dynaTrace to optimise application performance
Using dynaTrace to optimise application performance
 
Dynatrace
DynatraceDynatrace
Dynatrace
 
Designing Tracing Tools
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
 

Similar to Performance Analysis: new tools and concepts from the cloud

iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
DataStax Academy
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Julien Anguenot
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
Brendan Gregg
 
Container Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, Netflix
Docker, Inc.
 
From swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container serviceFrom swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container service
Spyros Trigazis
 
Instrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionInstrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in production
bcantrill
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
thelabdude
 
Summit demystifying systemd1
Summit demystifying systemd1Summit demystifying systemd1
Summit demystifying systemd1
Susant Sahani
 
Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005
dflexer
 
System Device Tree and Lopper: Concrete Examples - ELC NA 2022
System Device Tree and Lopper: Concrete Examples - ELC NA 2022System Device Tree and Lopper: Concrete Examples - ELC NA 2022
System Device Tree and Lopper: Concrete Examples - ELC NA 2022
Stefano Stabellini
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
Amazon Web Services
 
PostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldPostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized World
Jignesh Shah
 
Tối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G core
Tối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G coreTối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G core
Tối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G core
Vietnam Open Infrastructure User Group
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
Revolution Analytics
 
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharingVinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
VINEYARD - Versatile Integrated Accelerator-based Heterogeneous Data Centres
 
002 - Introduction to CUDA Programming_1.ppt
002 - Introduction to CUDA Programming_1.ppt002 - Introduction to CUDA Programming_1.ppt
002 - Introduction to CUDA Programming_1.ppt
ceyifo9332
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
ScyllaDB
 
Spark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsSpark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloads
S N
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning Fundamentals
Carlos Sierra
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
Enkitec
 

Similar to Performance Analysis: new tools and concepts from the cloud (20)

iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
iland Internet Solutions: Leveraging Cassandra for real-time multi-datacenter...
 
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analyticsLeveraging Cassandra for real-time multi-datacenter public cloud analytics
Leveraging Cassandra for real-time multi-datacenter public cloud analytics
 
Linux Systems Performance 2016
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
 
Container Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, Netflix
 
From swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container serviceFrom swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container service
 
Instrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in productionInstrumenting the real-time web: Node.js in production
Instrumenting the real-time web: Node.js in production
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Summit demystifying systemd1
Summit demystifying systemd1Summit demystifying systemd1
Summit demystifying systemd1
 
Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005Tuning parallelcodeonsolaris005
Tuning parallelcodeonsolaris005
 
System Device Tree and Lopper: Concrete Examples - ELC NA 2022
System Device Tree and Lopper: Concrete Examples - ELC NA 2022System Device Tree and Lopper: Concrete Examples - ELC NA 2022
System Device Tree and Lopper: Concrete Examples - ELC NA 2022
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
PostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized WorldPostgreSQL High Availability in a Containerized World
PostgreSQL High Availability in a Containerized World
 
Tối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G core
Tối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G coreTối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G core
Tối ưu hiệu năng đáp ứng các yêu cầu của hệ thống 4G core
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharingVinetalk: The missing piece for cluster managers to enable accelerator sharing
Vinetalk: The missing piece for cluster managers to enable accelerator sharing
 
002 - Introduction to CUDA Programming_1.ppt
002 - Introduction to CUDA Programming_1.ppt002 - Introduction to CUDA Programming_1.ppt
002 - Introduction to CUDA Programming_1.ppt
 
Performance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla ClusterPerformance Monitoring: Understanding Your Scylla Cluster
Performance Monitoring: Understanding Your Scylla Cluster
 
Spark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsSpark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloads
 
Oracle Performance Tuning Fundamentals
Oracle Performance Tuning FundamentalsOracle Performance Tuning Fundamentals
Oracle Performance Tuning Fundamentals
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
 

More from Brendan Gregg

IntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor Benchmarking
Brendan Gregg
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
Brendan Gregg
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting Started
Brendan Gregg
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
Brendan Gregg
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
Brendan Gregg
 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
Brendan Gregg
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
Brendan Gregg
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
Brendan Gregg
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
Brendan Gregg
 
LISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
Brendan Gregg
 
LPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing Tools
Brendan Gregg
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
Brendan Gregg
 
YOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflixYOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflix
Brendan Gregg
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
Brendan Gregg
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
Brendan Gregg
 
BPF Tools 2017
BPF Tools 2017BPF Tools 2017
BPF Tools 2017
Brendan Gregg
 
NetConf 2018 BPF Observability
NetConf 2018 BPF ObservabilityNetConf 2018 BPF Observability
NetConf 2018 BPF Observability
Brendan Gregg
 
FlameScope 2018
FlameScope 2018FlameScope 2018
FlameScope 2018
Brendan Gregg
 
ATO Linux Performance 2018
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018
Brendan Gregg
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
Brendan Gregg
 

More from Brendan Gregg (20)

IntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor Benchmarking
 
Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
 
Systems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting Started
 
Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
Performance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting Started
 
YOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
 
re:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
LISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
 
LPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing Tools
 
LSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
 
YOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflixYOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflix
 
eBPF Perf Tools 2019
eBPF Perf Tools 2019eBPF Perf Tools 2019
eBPF Perf Tools 2019
 
YOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
 
BPF Tools 2017
BPF Tools 2017BPF Tools 2017
BPF Tools 2017
 
NetConf 2018 BPF Observability
NetConf 2018 BPF ObservabilityNetConf 2018 BPF Observability
NetConf 2018 BPF Observability
 
FlameScope 2018
FlameScope 2018FlameScope 2018
FlameScope 2018
 
ATO Linux Performance 2018
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018
 
LISA17 Container Performance Analysis
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
 

Performance Analysis: new tools and concepts from the cloud

  • 1. Performance Analysis: new tools and concepts from the cloud Brendan Gregg Lead Performance Engineer, Joyent brendan.gregg@joyent.com SCaLE10x Jan, 2012
  • 2. whoami • I do performance analysis • I also write performance tools out of necessity • Was Brendan @ Sun Microsystems, Oracle, now Joyent
  • 3. Joyent • Cloud computing provider • Cloud computing software • SmartOS • host OS, and guest via OS virtualization • Linux, Windows • guest via KVM
  • 4. Agenda • Data • Example problems & solutions • How cloud environments complicate performance • Theory • Performance analysis • Summarize new tools & concepts • This talk uses SmartOS and DTrace to illustrate concepts that are applicable to most OSes.
  • 5. Data • Example problems: • CPU • Memory • Disk • Network • Some have neat solutions, some messy, some none • This is real world • Some I’ve covered before, some I haven’t
  • 6. CPU
  • 7. CPU utilization: problem • Would like to identify: • single or multiple CPUs at 100% utilization • average, minimum and maximum CPU utilization • CPU utilization balance (tight or loose distribution) • time-based characteristics: changing/bursting? burst interval, burst length • For small to large environments • entire datacenters or clouds
  • 8. CPU utilization • mpstat(1) has the data. 1 second, 1 server (16 CPUs):
  • 9. CPU utilization • Scaling to 60 seconds, 1 server:
  • 10. CPU utilization • Scaling to entire datacenter, 60 secs, 5312 CPUs:
  • 11. CPU utilization • Line graphs can solve some problems: • x-axis: time, 60 seconds • y-axis: utilization
  • 12. CPU utilization • ... but don’t scale well to individual devices • 5312 CPUs, each as a line:
  • 13. CPU utilization • Pretty, but scale limited as well:
  • 14. CPU utilization • Utilization as a heat map: • x-axis: time, y-axis: utilization • z-axis (color): number of CPUs
  • 15. CPU utilization • Available in Cloud Analytics (Joyent) • Clicking highlights and shows details; eg, hostname:
  • 16. CPU utilization • Utilization heat maps are also suitable and used for: • disks • network interfaces • Utilization as a metric can be a bit misleading • really a percent busy over a time interval • devices may accept more work at 100% busy • may not directly relate to performance impact
  • 17. CPU utilization: summary • Data readily available • Using a new visualization
  • 18. CPU usage • Given a CPU is hot, what is it doing? • Beyond just vmstat’s usr/sys ratio • Profiling (sampling at an interval) the program counter or stack back trace • user-land stack for %usr • kernel stack for %sys • Many tools can do this to some degree • Developer Studios/DTrace/oprofile/...
  • 19. CPU usage: profiling • Frequency count on-CPU user-land stack traces:
      # dtrace -x ustackframes=100 -n 'profile-997 /execname == "mysqld"/ {
          @[ustack()] = count(); } tick-60s { exit(0); }'
      dtrace: description 'profile-997 ' matched 2 probes
      CPU     ID                    FUNCTION:NAME
        1  75195                        :tick-60s
      [...]
              libc.so.1`__priocntlset+0xa
              libc.so.1`getparam+0x83
              libc.so.1`pthread_getschedparam+0x3c
              libc.so.1`pthread_setschedprio+0x1f
              mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x9ab
              mysqld`_Z10do_commandP3THD+0x198
              mysqld`handle_one_connection+0x1a6
              libc.so.1`_thrp_setup+0x8d
              libc.so.1`_lwp_start
                 4884
              mysqld`_Z13add_to_statusP17system_status_varS0_+0x47
              mysqld`_Z22calc_sum_of_all_statusP17system_status_var+0x67
              mysqld`_Z16dispatch_command19enum_server_commandP3THDPcj+0x1222
              mysqld`_Z10do_commandP3THD+0x198
              mysqld`handle_one_connection+0x1a6
              libc.so.1`_thrp_setup+0x8d
              libc.so.1`_lwp_start
                 5530
  • 20. CPU usage: profiling • The same output, annotated: over 500,000 lines were truncated at the “[...]” mark.
  • 22. CPU usage: visualization • Visualized as a “Flame Graph”:
  • 23. CPU usage: Flame Graphs • Just some Perl that turns DTrace output into an interactive SVG: mouse-over elements for details • It’s on github • http://github.com/brendangregg/FlameGraph • Works on kernel stacks, and both user+kernel • Shouldn’t be hard to have it process oprofile, etc.
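  For illustration, a minimal sketch of the pipeline (assuming the stackcollapse.pl and flamegraph.pl scripts from the FlameGraph repository; the file names out.stacks/out.folded/out.svg are illustrative):
      # profile mysqld user-land stacks for 60s, then render an interactive SVG
      dtrace -x ustackframes=100 -n 'profile-997 /execname == "mysqld"/ {
          @[ustack()] = count(); } tick-60s { exit(0); }' -o out.stacks
      ./stackcollapse.pl out.stacks > out.folded
      ./flamegraph.pl out.folded > out.svg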
  • 24. CPU usage: on the Cloud • Flame Graphs were born out of necessity on Cloud environments: • Perf issues need quick resolution (you just got hackernews’d) • Everyone is running different versions of everything (don’t assume you’ve seen the last of old CPU-hot code-path issues that have been fixed)
  • 25. CPU usage: summary • Data can be available • For cloud computing: easy for operators to fetch on OS virtualized environments; otherwise agent driven, and possibly other difficulties (access to CPU instrumentation counter-based interrupts) • Using a new visualization
  • 26. CPU latency • CPU dispatcher queue latency • thread is ready-to-run, and waiting its turn • Observable in coarse ways: • vmstat’s r • high load averages • Less coarse, with microstate accounting • prstat -mL’s LAT • How much is it affecting application performance?
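  For example, a quick check using microstate accounting (LAT is the percentage of time each thread spent waiting on a CPU dispatcher queue):
      $ prstat -mL 1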
  • 27. CPU latency: zonedispqlat.d • Using DTrace to trace kernel scheduler events:
      # ./zonedispqlat.d
      Tracing...
      Note: outliers (> 1 secs) may be artifacts due to the use of scalar globals (sorry).

      CPU disp queue latency by zone (ns):
        dbprod-045
               value  ------------- Distribution ------------- count
                 512 |                                         0
                1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             10210
                2048 |@@@@@@@@@@                               3829
                4096 |@                                        514
                8192 |                                         94
               16384 |                                         0
               32768 |                                         0
               65536 |                                         0
              131072 |                                         0
              262144 |                                         0
              524288 |                                         0
             1048576 |                                         1
             2097152 |                                         0
             4194304 |                                         0
             8388608 |                                         1
            16777216 |                                         0
      [...]
  • 28. CPU latency: zonedispqlat.d • CPU dispatcher queue latency by zonename (zonedispqlat.d), work in progress:
      #!/usr/sbin/dtrace -s
      #pragma D option quiet
      dtrace:::BEGIN
      {
          printf("Tracing...\n");
          printf("Note: outliers (> 1 secs) may be artifacts due to the ");
          printf("use of scalar globals (sorry).\n\n");
      }
      sched:::enqueue
      {
          /* scalar global (I don't think this can be thread local) */
          start[args[0]->pr_lwpid, args[1]->pr_pid] = timestamp;
      }
      sched:::dequeue
      /this->start = start[args[0]->pr_lwpid, args[1]->pr_pid]/
      {
          this->time = timestamp - this->start;
          /* workaround since zonename isn't a member of args[1]... */
          this->zone = ((proc_t *)args[1]->pr_addr)->p_zone->zone_name;
          @[stringof(this->zone)] = quantize(this->time);
          start[args[0]->pr_lwpid, args[1]->pr_pid] = 0;
      }
      tick-1sec
      {
          printf("CPU disp queue latency by zone (ns):\n");
          printa(@);
          trunc(@);
      }
  Save timestamp on enqueue; calculate delta on dequeue
  • 29. CPU latency: zonedispqlat.d • Instead of zonename, this could be process name, ... • Tracing scheduler enqueue/dequeue events and saving timestamps costs CPU overhead • they are frequent • I’d prefer to only trace dequeue, and reuse the existing microstate accounting timestamps • but one problem is a clash between unscaled and scaled timestamps
  • 30. CPU latency: on the Cloud • With virtualization, you can have high CPU latency despite idle CPUs, due to an instance consuming its quota • OS virtualization • not visible in vmstat r • is visible as part of prstat -mL’s LAT • more kstats recently added to SmartOS including nsec_waitrq (total run queue wait by zone) • Hardware virtualization • vmstat st (stolen)
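  For example, the new kstat might be read like this (a sketch; selecting by statistic name avoids guessing the module name, and the trailing 1 prints updates every second):
      $ kstat -p -s nsec_waitrq 1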
  • 31. CPU latency: caps • CPU cap latency from the host (zonecapslat.d):
      #!/usr/sbin/dtrace -s
      #pragma D option quiet
      sched:::cpucaps-sleep
      {
          start[args[0]->pr_lwpid, args[1]->pr_pid] = timestamp;
      }
      sched:::cpucaps-wakeup
      /this->start = start[args[0]->pr_lwpid, args[1]->pr_pid]/
      {
          this->time = timestamp - this->start;
          /* workaround since zonename isn't a member of args[1]... */
          this->zone = ((proc_t *)args[1]->pr_addr)->p_zone->zone_name;
          @[stringof(this->zone)] = quantize(this->time);
          start[args[0]->pr_lwpid, args[1]->pr_pid] = 0;
      }
      tick-1sec
      {
          printf("CPU caps latency by zone (ns):\n");
          printa(@);
          trunc(@);
      }
  • 32. CPU latency: summary • Partial data available • New tools/metrics created • although current DTrace solutions have overhead; we should be able to improve that • although, new kstats may be sufficient
  • 34. Memory: problem • Riak database has endless memory growth. • expected 9GB, after two days:
      $ prstat -c 1
      Please wait...
         PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
       21722 103        43G   40G cpu0    59    0  72:23:41 2.6% beam.smp/594
       15770 root     7760K  540K sleep   57    0  23:28:57 0.9% zoneadmd/5
          95 root        0K    0K sleep   99  -20   7:37:47 0.2% zpool-zones/166
       12827 root      128M   73M sleep  100    -   0:49:36 0.1% node/5
       10319 bgregg     10M 6788K sleep   59    0   0:00:00 0.0% sshd/1
       10402 root       22M  288K sleep   59    0   0:18:45 0.0% dtrace/1
      [...]
  • Eventually hits paging and terrible performance • needing a restart • Is this a memory leak? Or application growth?
  • 35. Memory: scope • Identify the subsystem and team responsible Subsystem Team Application Voxer Riak Basho Erlang Ericsson SmartOS Joyent
  • 36. Memory: heap profiling • What is in the heap?
      $ pmap 14719
      14719:  beam.smp
      0000000000400000    2168K r-x--  /opt/riak/erts-5.8.5/bin/beam.smp
      000000000062D000     328K rw---  /opt/riak/erts-5.8.5/bin/beam.smp
      000000000067F000 4193540K rw---  /opt/riak/erts-5.8.5/bin/beam.smp
      00000001005C0000 4194296K rw---    [ anon ]
      00000002005BE000 4192016K rw---    [ anon ]
      0000000300382000 4193664K rw---    [ anon ]
      00000004002E2000 4191172K rw---    [ anon ]
      00000004FFFD3000 4194040K rw---    [ anon ]
      00000005FFF91000 4194028K rw---    [ anon ]
      00000006FFF4C000 4188812K rw---    [ anon ]
      00000007FF9EF000  588224K rw---    [ heap ]
      [...]
  • ... and why does it keep growing? • Would like to answer these in production • Without restarting apps. Experimentation (backend=mmap, other allocators) wasn’t working.
  • 37. Memory: heap profiling • libumem was used for multi-threaded performance • libumem == user-land slab allocator • detailed observability can be enabled, allowing heap profiling and leak detection • While designed with speed and production use in mind, it still comes with some cost (time and space), and isn’t on by default. • UMEM_DEBUG=audit
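  A sketch of enabling and inspecting it (assumptions: the target is restarted under libumem with auditing, and mdb’s ::findleaks dcmd is used for leak detection; note that mdb -p stops the target while attached, so use with care in production):
      $ UMEM_DEBUG=audit LD_PRELOAD=libumem.so.1 <restart the application>
      $ mdb -p `pgrep beam.smp`
      > ::findleaks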
  • 38. Memory: heap profiling • libumem provides some default observability • Eg, slabs:
      > ::umem_malloc_info
      CACHE            BUFSZ MAXMAL BUFMALLC AVG_MAL  MALLOCED  OVERHEAD   %OVER
      0000000000707028     8      0        0       0         0         0    0.0%
      000000000070b028    16      8     8730       8     69836   1054998 1510.6%
      000000000070c028    32     16     8772      16    140352   1130491  805.4%
      000000000070f028    48     32  1148038      25  29127788 156179051  536.1%
      0000000000710028    64     48   344138      40  13765658  58417287  424.3%
      0000000000711028    80     64       36      62      2226      4806  215.9%
      0000000000714028    96     80     8934      79    705348   1168558  165.6%
      0000000000715028   112     96  1347040      87 117120208 190389780  162.5%
      0000000000718028   128    112   253107     111  28011923  42279506  150.9%
      000000000071a028   160    144    40529     118   4788681   6466801  135.0%
      000000000071b028   192    176      140     155     21712     25818  118.9%
      000000000071e028   224    208       43     188      8101      6497   80.1%
      000000000071f028   256    240      133     229     30447     26211   86.0%
      0000000000720028   320    304       56     276     15455     12276   79.4%
      0000000000723028   384    368       35     335     11726      7220   61.5%
      [...]
  • 39. Memory: heap profiling • ... and heap (captured @14GB RSS):
      > ::vmem
      ADDR             NAME                   INUSE       TOTAL  SUCCEED  FAIL
      fffffd7ffebed4a0 sbrk_top          9090404352 14240165888  4298117 84403
      fffffd7ffebee0a8 sbrk_heap         9090404352  9090404352  4298117     0
      fffffd7ffebeecb0 vmem_internal      664616960   664616960    79621     0
      fffffd7ffebef8b8 vmem_seg           651993088   651993088    79589     0
      fffffd7ffebf04c0 vmem_hash           12583424    12587008       27     0
      fffffd7ffebf10c8 vmem_vmem              46200       55344       15     0
      00000000006e7000 umem_internal      352862464   352866304    88746     0
      00000000006e8000 umem_cache            113696      180224       44     0
      00000000006e9000 umem_hash           13091328    13099008       86     0
      00000000006ea000 umem_log                   0           0        0     0
      00000000006eb000 umem_firewall_va           0           0        0     0
      00000000006ec000 umem_firewall              0           0        0     0
      00000000006ed000 umem_oversize     5218777974  5520789504  3822051     0
      00000000006f0000 umem_memalign              0           0        0     0
      0000000000706000 umem_default      2552131584  2552131584   307699     0
  • The heap is 9 GB (as expected), but sbrk_top total is 14 GB (equal to RSS). And growing. • Are there Gbyte-sized malloc()/free()s?
  • 40. Memory: malloc() profiling
      # dtrace -n 'pid$target::malloc:entry { @ = quantize(arg0); }' -p 17472
      dtrace: description 'pid$target::malloc:entry ' matched 3 probes
      ^C
               value  ------------- Distribution ------------- count
                   2 |                                         0
                   4 |                                         3
                   8 |@                                        5927
                  16 |@@@@                                     41818
                  32 |@@@@@@@@@                                81991
                  64 |@@@@@@@@@@@@@@@@@@                       169888
                 128 |@@@@@@@                                  69891
                 256 |                                         2257
                 512 |                                         406
                1024 |                                         893
                2048 |                                         146
                4096 |                                         1467
                8192 |                                         755
               16384 |                                         950
               32768 |                                         83
               65536 |                                         31
              131072 |                                         11
              262144 |                                         15
              524288 |                                         0
             1048576 |                                         1
             2097152 |                                         0
  • No huge malloc()s, but RSS continues to climb.
  • 41. Memory: malloc() profiling • The same output, annotated: this tool (one-liner) profiles malloc() request sizes.
  • 42. Memory: heap growth • Tracing why the heap grows via brk():
      # dtrace -n 'syscall::brk:entry /execname == "beam.smp"/ { ustack(); }'
      dtrace: description 'syscall::brk:entry ' matched 1 probe
      CPU     ID            FUNCTION:NAME
       10     18                 brk:entry
              libc.so.1`_brk_unlocked+0xa
              libumem.so.1`vmem_sbrk_alloc+0x84
              libumem.so.1`vmem_xalloc+0x669
              libumem.so.1`vmem_alloc+0x14f
              libumem.so.1`vmem_xalloc+0x669
              libumem.so.1`vmem_alloc+0x14f
              libumem.so.1`umem_alloc+0x72
              libumem.so.1`malloc+0x59
              libstdc++.so.6.0.14`_Znwm+0x20
              libstdc++.so.6.0.14`_Znam+0x9
              eleveldb.so`_ZN7leveldb9ReadBlockEPNS_16RandomAccessFileERKNS_11Rea...
              eleveldb.so`_ZN7leveldb5Table11BlockReaderEPvRKNS_11ReadOptionsERKN...
              eleveldb.so`_ZN7leveldb12_GLOBAL__N_116TwoLevelIterator13InitDataBl...
              eleveldb.so`_ZN7leveldb12_GLOBAL__N_116TwoLevelIterator4SeekERKNS_5...
              eleveldb.so`_ZN7leveldb12_GLOBAL__N_116TwoLevelIterator4SeekERKNS_5...
              eleveldb.so`_ZN7leveldb12_GLOBAL__N_115MergingIterator4SeekERKNS_5S...
              eleveldb.so`_ZN7leveldb12_GLOBAL__N_16DBIter4SeekERKNS_5SliceE+0xcc
              eleveldb.so`eleveldb_get+0xd3
              beam.smp`process_main+0x6939
              beam.smp`sched_thread_func+0x1cf
              beam.smp`thr_wrapper+0xbe
  This shows the user-land stack trace for every heap growth
  • 43. Memory: heap growth • More DTrace showed the size of the malloc()s causing the brk()s:
      # dtrace -x dynvarsize=4m -n '
          pid$target::malloc:entry { self->size = arg0; }
          syscall::brk:entry /self->size/ { printf("%d bytes", self->size); }
          pid$target::malloc:return { self->size = 0; }' -p 17472
      dtrace: description 'pid$target::malloc:entry ' matched 7 probes
      CPU     ID            FUNCTION:NAME
        0     44                 brk:entry 8343520 bytes
        0     44                 brk:entry 8343520 bytes
      [...]
  • These 8 Mbyte malloc()s grew the heap • Even though the heap has Gbytes not in use • This is starting to look like an OS issue
  • 44. Memory: allocator internals • More tools were created: • Show memory entropy (+ malloc - free) along with heap growth, over time • Show codepath taken for allocations; compare successful with unsuccessful (heap growth) • Show allocator internals: sizes, options, flags • And run in the production environment • Briefly; tracing frequent allocs does cost overhead • Casting light into what was a black box
  • 45. Memory: allocator internals
      4  <- vmem_xalloc                         0
      4  -> _sbrk_grow_aligned                  4096
      4  <- _sbrk_grow_aligned                  17155911680
      4  -> vmem_xalloc                         7356400
      4    | vmem_xalloc:entry                  umem_oversize
      4  -> vmem_alloc                          7356416
      4  -> vmem_xalloc                         7356416
      4    | vmem_xalloc:entry                  sbrk_heap
      4  -> vmem_sbrk_alloc                     7356416
      4  -> vmem_alloc                          7356416
      4  -> vmem_xalloc                         7356416
      4    | vmem_xalloc:entry                  sbrk_top
      4  -> vmem_reap                           16777216
      4  <- vmem_reap                           3178535181209758
      4    | vmem_xalloc:return vmem_xalloc() == NULL, vm: sbrk_top,
             size: 7356416, align: 4096, phase: 0, nocross: 0, min: 0, max: 0, vmflag: 1
              libumem.so.1`vmem_xalloc+0x80f
              libumem.so.1`vmem_sbrk_alloc+0x33
              libumem.so.1`vmem_xalloc+0x669
              libumem.so.1`vmem_alloc+0x14f
              libumem.so.1`vmem_xalloc+0x669
              libumem.so.1`vmem_alloc+0x14f
              libumem.so.1`umem_alloc+0x72
              libumem.so.1`malloc+0x59
              libstdc++.so.6.0.3`_Znwm+0x2b
              libstdc++.so.6.0.3`_ZNSs4_Rep9_S_createEmmRKSaIcE+0x7e
  • 46. Memory: solution • These new tools and metrics pointed to the allocation algorithm “instant fit” • Someone had suggested this earlier; the tools provided solid evidence that this really was the case here • A new version of libumem was built to force use of VM_BESTFIT • and added by Robert Mustacchi as a tunable: UMEM_OPTIONS=allocator=best • Customer restarted Riak with new libumem version • Problem solved
  • 47. Memory: on the Cloud • With OS virtualization, you can have: Paging without scanning • paging == swapping blocks with physical storage • swapping == swapping entire threads between main memory and physical storage • Resource control paging is unrelated to the page scanner, so, no vmstat scan rate (sr) despite anonymous paging • More new tools: DTrace sysinfo:::anonpgin by process name, zonename
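  For example, a one-liner sketch of that metric (sysinfo:::anonpgin fires on anonymous page-ins; zonename and execname are built-in D variables):
      # dtrace -n 'sysinfo:::anonpgin { @[zonename, execname] = count(); }'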
  • 48. Memory: summary • Superficial data available, detailed info not • not by default • Many new tools were created • not easy, but made possible with DTrace
  • 49. Disk
  • 50. Disk: problem • Application performance issues • Disks look busy (iostat) • Blame the disks?
      $ iostat -xnz 1
      [...]
                          extended device statistics
          r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
        124.0  334.9 15677.2 40484.9  0.0  1.0    0.0    2.2   1  69 c0t1d0
                          extended device statistics
          r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
        114.0  407.1 14595.9 49409.1  0.0  0.8    0.0    1.5   1  56 c0t1d0
                          extended device statistics
          r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
         85.0  438.0 10814.8 53242.1  0.0  0.8    0.0    1.6   1  57 c0t1d0
  • Many graphical tools are built upon iostat
  • 51. Disk: on the Cloud • Tenants can’t see each other • Maybe a neighbor is doing a backup? • Maybe a neighbor is running a benchmark? • Can’t see their processes (top/prstat) • Blame what you can’t see
  • 52. Disk: VFS • Applications usually talk to a file system • and are hurt by file system latency • Disk I/O can be: • unrelated to the application: asynchronous tasks • inflated from what the application requested • deflated from what the application requested • blind to issues caused higher up the kernel stack
  • 53. Disk: issues with iostat(1) • Unrelated: • other applications / tenants • file system prefetch • file system dirty data flushing • Inflated: • rounded up to the next file system record size • extra metadata for on-disk format • read-modify-write of RAID5
  • 54. Disk: issues with iostat(1) • Deflated: • read caching • write buffering • Blind: • lock contention in the file system • CPU usage by the file system • file system software bugs • file system queue latency
  • 55. Disk: issues with iostat(1) • blind (continued): • disk cache flush latency (if your file system does it) • file system I/O throttling latency • I/O throttling is a new ZFS feature for cloud environments • adds artificial latency to file system I/O to throttle it • added by Bill Pijewski and Jerry Jelinek of Joyent
  • 56. Disk: file system latency • Using DTrace to summarize ZFS read latency:
      $ dtrace -n 'fbt::zfs_read:entry { self->start = timestamp; }
          fbt::zfs_read:return /self->start/ {
          @["ns"] = quantize(timestamp - self->start); self->start = 0; }'
      dtrace: description 'fbt::zfs_read:entry ' matched 2 probes
      ^C
      ns
               value  ------------- Distribution ------------- count
                 512 |                                         0
                1024 |@                                        6
                2048 |@@                                       18
                4096 |@@@@@@@                                  79
                8192 |@@@@@@@@@@@@@@@@@                        191
               16384 |@@@@@@@@@@                               112
               32768 |@                                        14
               65536 |                                         1
              131072 |                                         1
              262144 |                                         0
              524288 |                                         0
             1048576 |                                         0
             2097152 |                                         0
             4194304 |@@@                                      31
             8388608 |@                                        9
            16777216 |                                         0
  • 57. Disk: file system latency • The same distribution, annotated: the fast cluster (up to ~131072 ns) is cache reads; the slow cluster around 4-8 ms is disk reads.
  • 58. Disk: file system latency • Tracing zfs events using zfsslower.d:
      # ./zfsslower.d 10
      TIME                  PROCESS  D  KB  ms FILE
      2011 May 17 01:23:12  mysqld   R  16  19 /z01/opt/mysql5-64/data/xxxxx/xxxxx.ibd
      2011 May 17 01:23:13  mysqld   W  16  10 /z01/var/mysql/xxxxx/xxxxx.ibd
      2011 May 17 01:23:33  mysqld   W  16  11 /z01/var/mysql/xxxxx/xxxxx.ibd
      2011 May 17 01:23:33  mysqld   W  16  10 /z01/var/mysql/xxxxx/xxxxx.ibd
      2011 May 17 01:23:51  httpd    R  56  14 /z01/home/xxxxx/xxxxx/xxxxx/xxxxx/xxxxx
      ^C
  • Argument is the minimum latency in milliseconds
  • 59. Disk: file system latency • Can trace this from other locations too: • VFS layer: filter on desired file system types • syscall layer: filter on file descriptors for file systems • application layer: trace file I/O calls
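  For example, at the syscall layer (a sketch; this times every read(2) system-wide, so file system reads are mixed with sockets and pipes unless filtered, e.g. by checking fds[arg0] on entry):
      # dtrace -n 'syscall::read:entry { self->ts = timestamp; }
          syscall::read:return /self->ts/ {
          @["read (ns)"] = quantize(timestamp - self->ts); self->ts = 0; }'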
  • 60. Disk: file system latency • And using SystemTap:
      # ./vfsrlat.stp
      Tracing... Hit Ctrl-C to end
      ^C
      [..]
      ext4 (ns):
       value |-------------------------------------------------- count
         256 |                                                   0
         512 |                                                   0
        1024 |                                                   16
        2048 |                                                   17
        4096 |                                                   4
        8192 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 16321
       16384 |                                                   50
       32768 |                                                   1
       65536 |                                                   13
      131072 |                                                   0
      262144 |                                                   0
  • Traces vfs.read to vfs.read.return, and gets the FS type via: $file->f_path->dentry->d_inode->i_sb->s_type->name • Warning: this script has crashed ubuntu/CentOS; I’m told RHEL is better
  • 61. Disk: file system visualizations • File system latency as a heat map (Cloud Analytics): • This screenshot shows severe outliers
  • 62. Disk: file system visualizations • Sometimes the heat map is very surprising: • This screenshot is from the Oracle ZFS Storage Appliance
  • 63. Disk: summary • Misleading data available • New tools/metrics created • Latency visualizations
  • 65. Network: problem • TCP SYNs queue in-kernel until they are accept()ed • The queue length is the TCP listen backlog • may be set in listen() • and limited by a system tunable (usually 128) • on SmartOS: tcp_conn_req_max_q • What if the queue remains full • eg, application is overwhelmed with other work, • or CPU starved • ... and another SYN arrives?
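  For example, the system-wide ceiling might be inspected or raised with ndd (a sketch for SmartOS/illumos; the value 1024 is illustrative):
      # ndd -get /dev/tcp tcp_conn_req_max_q
      # ndd -set /dev/tcp tcp_conn_req_max_q 1024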
  • 66. Network: TCP listen drops • Packet is dropped by the kernel • fortunately a counter is bumped:
      $ netstat -s | grep Drop
      tcpTimRetransDrop   =    56     tcpTimKeepalive     =  2582
      tcpTimKeepaliveProbe=  1594     tcpTimKeepaliveDrop =    41
      tcpListenDrop       =3089298    tcpListenDropQ0     =     0
      tcpHalfOpenDrop     =     0     tcpOutSackRetrans   =1400832
      icmpOutDrops        =     0     icmpOutErrors       =     0
      sctpTimRetrans      =     0     sctpTimRetransDrop  =     0
      sctpTimHearBeatProbe=     0     sctpTimHearBeatDrop =     0
      sctpListenDrop      =     0     sctpInClosed        =     0
  • Remote host waits, and then retransmits • TCP retransmit interval; usually 1 or 3 seconds
  • 67. Network: predicting drops • How do we know if we are close to dropping? • An early warning
  • 68. Network: tcpconnreqmaxq.d • DTrace script traces drop events, if they occur:
      # ./tcpconnreqmaxq.d
      Tracing... Hit Ctrl-C to end.
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      2012 Jan 19 01:37:52 tcp_input_listener:tcpListenDrop cpid:11504
      [...]
  • ... and when Ctrl-C is hit:
  • 69. Network: tcpconnreqmaxq.d
      tcp_conn_req_cnt_q distributions:

        cpid:3063    max_q:8
               value  ------------- Distribution ------------- count
                  -1 |                                         0
                   0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
                   1 |                                         0

        cpid:11504   max_q:128
               value  ------------- Distribution ------------- count
                  -1 |                                         0
                   0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@     7279
                   1 |@@                                       405
                   2 |@                                        255
                   4 |@                                        138
                   8 |                                         81
                  16 |                                         83
                  32 |                                         62
                  64 |                                         67
                 128 |                                         34
                 256 |                                         0

      tcpListenDrops:
        cpid:11504   max_q:128            34
  • 70. Network: tcpconnreqmaxq.d • The same output, annotated: “value” is the length of the queue, measured on each SYN event; “max_q” is the tcp_conn_req_max_q value in use.
  • 71. Network: tcplistendrop.d • More details can be fetched as needed:
      # ./tcplistendrop.d
      TIME                  SRC-IP          PORT        DST-IP           PORT
      2012 Jan 19 01:22:49  10.17.210.103   25691  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.108   18423  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.116   38883  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.117   10739  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.112   27988  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.106   28824  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.12.143.16    65070  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.100   56392  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.99    24628  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.98    11686  ->   192.192.240.212  80
      2012 Jan 19 01:22:49  10.17.210.101   34629  ->   192.192.240.212  80
      [...]
  • Just tracing the drop code-path • Don’t need to pay the overhead of sniffing all packets
  • 72. Network: DTrace code • Key code from tcplistendrop.d:
      fbt::tcp_input_listener:entry { self->mp = args[1]; }
      fbt::tcp_input_listener:return { self->mp = 0; }
      mib:::tcpListenDrop
      /self->mp/
      {
          this->iph = (ipha_t *)self->mp->b_rptr;
          this->tcph = (tcph_t *)(self->mp->b_rptr + 20);
          printf("%-20Y %-18s %-5d -> %-18s %-5d\n", walltimestamp,
              inet_ntoa(&this->iph->ipha_src),
              ntohs(*(uint16_t *)this->tcph->th_lport),
              inet_ntoa(&this->iph->ipha_dst),
              ntohs(*(uint16_t *)this->tcph->th_fport));
      }
  • This uses the unstable-interface fbt provider • a stable tcp provider now exists, which is better for more common tasks - like connections by IP
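  For example, counting accepted connections by remote address and local port with the stable provider (a sketch; here args[3] is the translated tcpsinfo_t of the tcp provider):
      # dtrace -n 'tcp:::accept-established {
          @[args[3]->tcps_raddr, args[3]->tcps_lport] = count(); }'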
  • 73. Network: summary • For TCP, while many counters are available, they are system wide integers • Custom tools can show more details • addresses and ports • kernel state • needs kernel access and dynamic tracing
  • 74. Data Recap • Problem types • CPU utilization scalability • CPU usage scalability • CPU latency observability • Memory observability • Disk observability • Network observability
  • 75. Data Recap • Problem types, solution types • CPU utilization scalability • CPU usage scalability • CPU latency observability • Memory observability • Disk observability • Network observability • solution types: visualizations, metrics
  • 77. Performance Analysis • Goals • Capacity planning • Issue analysis
  • 78. Performance Issues • Strategy • Step 1: is there a problem? • Step 2: which subsystem/team is responsible? • Difficult to get past these steps without reliable metrics
  • 79. Problem Space • Myths • Vendors provide good metrics with good coverage • The problem is to line-graph them • Realities • Metrics can be wrong, incomplete and misleading, requiring time and expertise to interpret • Line graphs can hide issues
  • 80. Problem Space • Cloud computing confuses matters further: • hiding metrics from neighbors • throttling performance due to invisible neighbors
  • 81. Example Problems • Included: • Understanding utilization across 5,312 CPUs • Using disk I/O metrics to explain application performance • A lack of metrics for memory growth, packet drops, ...
  • 82. Example Solutions: tools • Device utilization heat maps for CPUs • Flame graphs for CPU profiling • CPU dispatcher queue latency by zone • CPU caps latency by zone • malloc() size profiling • Heap growth stack backtraces • File system latency distributions • File system latency tracing • TCP accept queue length distribution • TCP listen drop tracing with details
  • 83. Key Concepts • Visualizations • heat maps for device utilization and latency • flame graphs • Custom metrics often necessary • Latency-based for issue analysis • If coding isn’t practical/timely, use dynamic tracing • Cloud Computing • Provide observability (often to show what the problem isn’t) • Develop new metrics for resource control effects
  • 84. DTrace • Many problems were only solved thanks to DTrace • In the SmartOS cloud environment: • The compute node (global zone) can DTrace everything (except for KVM guests, for which it has a limited view: resource I/O + some MMU events, so far) • SmartMachines (zones) have the DTrace syscall, profile (their user-land only), pid and USDT providers • Joyent Cloud Analytics uses DTrace from the global zone to give extended details to customers
  • 85. Performance • The more you know, the more you don’t • Hopefully I’ve turned some unknown-unknowns into known-unknowns
  • 86. Thank you • Resources: • http://dtrace.org/blogs/brendan • More CPU utilization visualizations: http://dtrace.org/blogs/brendan/2011/12/18/visualizing-device-utilization/ • Flame Graphs: http://dtrace.org/blogs/brendan/2011/12/16/flame-graphs/ and http://github.com/brendangregg/FlameGraph • More iostat(1) & file system latency discussion: http://dtrace.org/blogs/brendan/tag/filesystem-2/ • Cloud Analytics: • OSCON slides: http://dtrace.org/blogs/dap/files/2011/07/ca-oscon-data.pdf • Joyent: http://joyent.com • brendan@joyent.com