JVM Profiling
Under da Hood
Richard Warburton - @RichardWarburto
Nitsan Wakart - @nitsanw
Why Profile?
Lies, Damn Lies and Statistical Profiling
Under the Hood
Conclusion
Jvm profiling under the hood
Measure data from your application
Exploratory Profiling
Execution Profiling
=
Where in the code is my application
spending time?
CPU Profiling Limitations
● Finds CPU bound bottlenecks
● Many problems not CPU Bound
○ Networking
○ Database or External Service
○ I/O
○ Garbage Collection
○ Insufficient Parallelism
○ Blocking & Queuing Effects
Why Profile?
Lies, Damn Lies and Statistical Profiling
Under the Hood
Conclusion
Different Execution Profilers
● Instrumenting
○ Adds timing code to application
● Sampling
○ Collects thread dumps periodically
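The two styles can be contrasted in a minimal sketch. The class and method names here (`ProfilerStyles`, `timeNanos`, `sampleTopFrames`, `spin`) are made up for illustration and are not taken from any profiler: the first method wraps timing code around a task, the second periodically captures another thread's stack with `Thread.getStackTrace()` and counts the top frame.

```java
import java.util.HashMap;
import java.util.Map;

final class ProfilerStyles {
    // Instrumenting: timing code is wrapped around the code being measured.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Sampling: periodically capture the target thread's stack and count
    // which method is on top of it.
    static Map<String, Integer> sampleTopFrames(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                counts.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop sampling if interrupted
                break;
            }
        }
        return counts;
    }

    // Hypothetical busy work that ends when the thread is interrupted.
    static void spin() {
        while (!Thread.currentThread().isInterrupted()) {
            Math.sqrt(System.nanoTime());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("instrumented: " + timeNanos(() -> Math.sqrt(42)) + " ns");

        Thread worker = new Thread(ProfilerStyles::spin, "worker");
        worker.start();
        System.out.println("sampled: " + sampleTopFrames(worker, 20, 5));
        worker.interrupt();
        worker.join();
    }
}
```

The instrumented number includes the cost of the timing calls themselves, while the sampled counts only approximate where time went. Both effects grow in importance as the measured code gets smaller.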
Sampling Profilers
WebServerThread.run()
Controller.doSomething() Controller.next()
Repo.readPerson()
new Person()
View.printHtml()
Periodicity Bias
● Bias from sampling at a fixed interval
● Periodic operations with the same frequency
as the samples
● Timed operations
Periodicity Bias
a() ??? a() ??? a() ??? a() ???
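Periodicity bias can be reproduced with a small simulation. This is a sketch under made-up assumptions: a workload where a() occupies the first millisecond of every 10 ms period and b() the other nine, with `methodAt` and `fractionInA` as hypothetical helpers.

```java
final class PeriodicityBias {
    // Simulated workload: a() runs during the first millisecond of every
    // 10 ms period, b() runs for the other nine. a()'s true share is 10%.
    static String methodAt(long timeMillis) {
        return (timeMillis % 10 == 0) ? "a" : "b";
    }

    // Fraction of samples that land in a() when sampling every intervalMillis,
    // starting at time zero.
    static double fractionInA(long intervalMillis, int samples) {
        int hits = 0;
        for (int i = 0; i < samples; i++) {
            if (methodAt(i * intervalMillis).equals("a")) {
                hits++;
            }
        }
        return (double) hits / samples;
    }

    public static void main(String[] args) {
        System.out.println(fractionInA(10, 1000)); // interval matches the period: prints 1.0
        System.out.println(fractionInA(7, 1000));  // interval out of step: prints 0.1
    }
}
```

When the sampling interval matches the period, every sample lands in a() even though it accounts for a tenth of the time. Sampling profilers commonly vary the interval for this reason.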
Stack Trace Sampling
● JVMTI interface: GetCallTrace
○ Trigger a global safepoint (not on Zing)
○ Collect stack trace
● Large impact on application
● Samples only at safepoints
Example
private static void outer()
{
    for (int i = 0; i < OUTER; i++)
    {
        hotMethod(i);
    }
}
// https://github.com/RichardWarburton/profiling-samples
Example (2)
private static void hotMethod(final int i)
{
    for (int k = 0; k < N; k++)
    {
        final int[] array = SafePointBias.array;
        final int index = i % SIZE;
        for (int j = index; j < SIZE; j++)
        {
            array[index] += array[j];
        }
    }
}
-XX:+PrintSafepointStatistics
ThreadDump 48
Maximum sync time 985 ms
What's a safepoint?
● Java threads poll a global flag
○ At the back edge of ‘uncounted’ loops
○ At method entry/exit
● A safepoint poll can be delayed by:
○ Large methods
○ Long running ‘counted’ loops
○ BONUS: Page faults/thread suspension
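The counted/uncounted distinction can be illustrated with two loops that compute the same sum. This is a sketch with hypothetical names (`LoopKinds`, `countedSum`, `uncountedSum`), assuming the HotSpot behaviour described in this talk: an int induction variable with a loop-invariant bound makes a loop ‘counted’, so the JIT may leave out the safepoint poll, while a long induction variable makes it ‘uncounted’ and keeps the poll. Other JVMs and newer HotSpot versions may treat these loops differently.

```java
final class LoopKinds {
    // 'Counted' loop: int induction variable, fixed stride, loop-invariant bound.
    // The JIT may compile this without a safepoint poll on the back edge.
    static long countedSum(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // 'Uncounted' loop: the long induction variable stops the JIT from
    // treating it as counted, so a poll stays on the back edge.
    static long uncountedSum(int[] data) {
        long sum = 0;
        for (long i = 0; i < data.length; i++) {
            sum += data[(int) i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(countedSum(data));   // prints 1000000
        System.out.println(uncountedSum(data)); // prints 1000000
    }
}
```

Both methods return the same result. The difference is only in how long a thread running them can take to reach a safepoint, which is what delays a safepoint-based sample.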
Safepoint Bias
WebServerThread.run()
Controller.doSomething() Controller.next()
Repo.readPerson()
new Person()
View.printHtml() ???
Let sleeping dogs lie?
● ‘GetCallTrace’ profilers will sample ALL
threads
● Even sleeping threads...
This Application Mostly Sleeps
JVisualVM snapshot
No CPU? No profile!
JMC profile
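One Java-level approximation of ignoring sleeping threads is to keep only threads whose state is RUNNABLE. This is a sketch with hypothetical names (`RunnableOnly`, `runnableThreads`), not how any of the profilers shown here work. Note that `Thread.State.RUNNABLE` also covers threads blocked inside native calls such as socket reads, so it is coarser than "actually on CPU".

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

final class RunnableOnly {
    // Keep only threads the JVM reports as RUNNABLE; sleeping, waiting and
    // blocked threads are dropped from the sample.
    static List<Thread> runnableThreads(Collection<Thread> threads) {
        List<Thread> runnable = new ArrayList<>();
        for (Thread t : threads) {
            if (t.getState() == Thread.State.RUNNABLE) {
                runnable.add(t);
            }
        }
        return runnable;
    }

    public static void main(String[] args) {
        Collection<Thread> all = Thread.getAllStackTraces().keySet();
        System.out.println(all.size() + " threads, " + runnableThreads(all).size() + " runnable");
    }
}
```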
Why Profile?
Lies, Damn Lies and Statistical Profiling
Under the Hood
Conclusion
Honest Profiler
https://github.com/richardwarburton/honest-profiler
AsyncGetCallTrace
● Used by Oracle Solaris Studio
● Adapted to open source prototype by
Google’s Jeremy Manson
● Unsupported, Undocumented …
Underestimated
SIGPROF - Interrupt Handlers
● OS Managed timing based interrupt
● Interrupts the thread and directly calls an
event handler
● Used by profilers we’ll be talking about
Design
[Diagram: OS Timer Thread, Signal Handler (x2), Processor Thread, Log File, Graphical UI, Console UI]
“You are in a maze of twisty little stack frames,
all alike”
AsyncGetCallTrace under the hood
● A Java thread is ‘possessed’
● You have the PC/FP/SP
● What is the call trace?
○ jmethodID - Java method identifier
○ bci - bytecode index, used to find the line number
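The step from bci to line number can be sketched as a lookup in a line number table, where each entry records the bci at which a source line starts. `LineLookup` and its methods are hypothetical names, and the table contents below are invented. The table plays the role of what JVMTI's GetLineNumberTable returns for a method.

```java
import java.util.Map;
import java.util.TreeMap;

final class LineLookup {
    // start bci of each source line -> line number
    private final TreeMap<Integer, Integer> startBciToLine = new TreeMap<>();

    void addEntry(int startBci, int line) {
        startBciToLine.put(startBci, line);
    }

    // The line for a bci is the entry with the greatest start bci <= bci.
    // Returns -1 when no entry covers the bci.
    int lineFor(int bci) {
        Map.Entry<Integer, Integer> entry = startBciToLine.floorEntry(bci);
        return entry == null ? -1 : entry.getValue();
    }

    public static void main(String[] args) {
        LineLookup table = new LineLookup();
        table.addEntry(0, 10);
        table.addEntry(5, 11);
        table.addEntry(12, 13);
        System.out.println(table.lineFor(7)); // prints 11
    }
}
```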
Where Am I?
● Given a PC what is the current method?
● Is this a Java method?
○ Each method ‘lives’ in a range of addresses
● If not, what do we do?
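Because each compiled method occupies a range of addresses, finding the method for a PC amounts to a range lookup. The sketch below models that with a sorted map keyed by start address. `CodeRanges`, `register` and `methodAt` are hypothetical names and the addresses are invented; the JVM's own code cache lookup is more involved.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

final class CodeRanges {
    private static final class Range {
        final long end;      // exclusive end address
        final String method;

        Range(long end, String method) {
            this.end = end;
            this.method = method;
        }
    }

    // start address -> range, sorted so the candidate range can be found by floor lookup
    private final TreeMap<Long, Range> byStart = new TreeMap<>();

    void register(long start, long end, String method) {
        byStart.put(start, new Range(end, method));
    }

    // Returns the method whose [start, end) range contains pc, or empty when
    // the pc is not inside any registered Java method (e.g. native code).
    Optional<String> methodAt(long pc) {
        Map.Entry<Long, Range> entry = byStart.floorEntry(pc);
        if (entry == null || pc >= entry.getValue().end) {
            return Optional.empty();
        }
        return Optional.of(entry.getValue().method);
    }

    public static void main(String[] args) {
        CodeRanges ranges = new CodeRanges();
        ranges.register(0x1000, 0x1100, "Repo.readPerson");
        ranges.register(0x2000, 0x2040, "View.printHtml");
        System.out.println(ranges.methodAt(0x1010)); // prints Optional[Repo.readPerson]
        System.out.println(ranges.methodAt(0x1800)); // prints Optional.empty
    }
}
```

The empty result is the "If not, what do we do?" case: the PC belongs to something other than a Java method, and the caller needs another way to find a Java frame.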
Java Method? Which line?
● Given a PC, what is the current line?
○ Not all instructions map directly to a source line
● Given super-scalar CPUs what does PC
mean?
● What are the limits of PC accuracy?
“> I think Andi mentioned this to me last year --
> that instruction profiling was no longer reliable.
It never was.”
http://permalink.gmane.org/gmane.linux.kernel.perf.user/1948
Exchange between Brendan Gregg and Andi Kleen
Skid
● The PC indicated will be >= the PC at sample time
● Known limitation of instruction profiling
● Leads to harder ‘blame analysis’
Limits of line number accuracy:
The line number is derived from the BCI closest to the PC
that can be attributed (-XX:+DebugNonSafepoints)
The PC itself is within some skid distance of the
actually sampled instruction
The Stack
● Divided into frames
○ frame { sender*, stack*, pc }
● A singly linked list:
root(null, s0, pc1) <- call1(root, s1, pc2) <- call2(call1, s2, pc3)
● Convert to: (jmethodID, lineno)
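Treating the stack as a singly linked list of frames, collecting a call trace is a walk from the top frame through each sender. This is a simplified model with hypothetical names (`FrameWalk`, `Frame`, `callTrace`); real frames are not objects, and finding the sender is the hard part covered on the following slides.

```java
import java.util.ArrayList;
import java.util.List;

final class FrameWalk {
    // Model of frame { sender*, stack*, pc }.
    static final class Frame {
        final Frame sender; // caller's frame, null for the root
        final long sp;      // stack pointer for this frame
        final long pc;      // program counter recorded for this frame

        Frame(Frame sender, long sp, long pc) {
            this.sender = sender;
            this.sp = sp;
            this.pc = pc;
        }
    }

    // Walk from the top frame towards the root, collecting at most maxDepth PCs.
    static List<Long> callTrace(Frame top, int maxDepth) {
        List<Long> pcs = new ArrayList<>();
        for (Frame f = top; f != null && pcs.size() < maxDepth; f = f.sender) {
            pcs.add(f.pc);
        }
        return pcs;
    }

    public static void main(String[] args) {
        Frame root = new Frame(null, 300, 1);
        Frame call1 = new Frame(root, 200, 2);
        Frame call2 = new Frame(call1, 100, 3);
        System.out.println(callTrace(call2, 16)); // prints [3, 2, 1]
    }
}
```

Each collected PC would then be converted to a method identifier and line number. The depth limit keeps the walk bounded if the chain is longer than the caller's buffer.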
A typical stack
● JVM Thread runner infra:
○ JavaThread::run to JavaCalls::call_helper
● Interleaved Java frames:
○ Interpreted
○ Compiled
○ Java to Native and back
● Top frame may be Java or Native
Native frames
● Ignored, but need to navigate through
● Use a dedicated FP register to find sender
● But only if compiled to do so…
● Use a last remembered Java frame instead
See: http://duartes.org/gustavo/blog/post/journey-to-the-stack/
Java Compiled Frames
● C1/C2 produce native code
● No FP register: use set frame size
● Challenge: methods can move (GC)
● Challenge: methods can get recompiled
Java Interpreter frames
● Separately managed by the runtime
● Make an effort to look like normal frames
● Challenge: may be interrupted half-way
through construction...
Virtual Frames
● C1/C2 inline code (intrinsics/other methods)
● No data on stack
● Must use JVM debug info
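Since inlined methods leave no data on the stack, one physical frame may stand for several logical (virtual) frames, and the debug info recorded for the PC is what recovers them. The sketch below is a loose model: `InlineExpansion`, `expand` and the map of PC to inlined scopes are all hypothetical stand-ins for the JVM's debug info.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InlineExpansion {
    // scopesByPc maps a PC to the methods active at that PC, innermost first.
    // A PC with no debug info yields a single "<unknown>" frame.
    static List<String> expand(List<Long> physicalPcs, Map<Long, List<String>> scopesByPc) {
        List<String> trace = new ArrayList<>();
        for (long pc : physicalPcs) {
            List<String> scopes = scopesByPc.get(pc);
            if (scopes == null) {
                trace.add("<unknown>");
            } else {
                trace.addAll(scopes);
            }
        }
        return trace;
    }

    public static void main(String[] args) {
        Map<Long, List<String>> debugInfo = new HashMap<>();
        // Person.<init> was inlined into Repo.readPerson: one physical frame, two virtual frames.
        debugInfo.put(100L, Arrays.asList("Person.<init>", "Repo.readPerson"));
        debugInfo.put(200L, Arrays.asList("Controller.doSomething"));
        System.out.println(expand(Arrays.asList(100L, 200L), debugInfo));
        // prints [Person.<init>, Repo.readPerson, Controller.doSomething]
    }
}
```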
AsyncGetCallTrace Limitations
● Only profiles running threads
● Accuracy of line info limited by reality
● Only reports Java frames/threads
● Must look up debug info during call
Compilers: Friend or Fiend?
void safe_reset(void *start, size_t size) {
    char *base = reinterpret_cast<char *>(start);
    char *end = base + size;
    for (char *p = base; p < end; p++) {
        *p = 0;
    }
}
Compilers: Friend or Fiend?
safe_reset(void*, unsigned long):
    lea rdx, [rdi+rsi]
    cmp rdi, rdx
    jae .L3
    sub rdx, rdi
    xor esi, esi
    jmp memset
.L3:
    rep ret
Concurrency Bug
● Even simple concurrency bugs are hard to
spot
● Unspotted race condition in the ring buffer
● Spotted thanks to open source & Rajiv
[Diagram labels: Signal, Writer, Reader]
Extra Credit!
Native Profiling Tools
● Profile native methods
● Profile at the instruction level
● Profile hardware counters
Perf
● A Linux profiling tool
● Can be made to work with Java
● JMH integration
● Ongoing integration efforts
Solaris Studio
● Works on Linux!
● Secret Weapon!
● Give it a go!
ZVision
● Works for Zing
● No HWC support
● Very informative
Why Profile?
Lies, Damn Lies and Statistical Profiling
Under the Hood
Conclusion
What did we cover?
● Biases in Profilers
● More accurate sampling
● Alternative Profiling Approaches
Don’t just blindly trust your tooling.
Test your measuring instruments
Open Source enables implementation review
Q & A
@nitsanw
psy-lob-saw.blogspot.co.uk
@richardwarburto
insightfullogic.com
java8training.com
www.pluralsight.com/author/richard-warburton