Garbage Collectors
Haim Yadid,
Head of Performance and Application Infra Group
Motivation & Goals
Why does Java have GC?
Memory management is hard
malloc/free
Memory leaks and dangling pointers
Heap corruption (freeing an object twice)
Reference counting
Cyclic graphs
GC
I allocate
myObject = new MyClass(2015)
GC cleans
So everything is cool!
Goal: Minimize Memory Overhead
Garbage collector memory overhead
Internal structures
Additional memory required for GC
Policy: how much garbage is allowed to accumulate before GC runs
Goal: Application Throughput
No garbage collection at all --> application throughput is 1 (the application runs 100% of the time)
If garbage collection consumes on average x milliseconds every y milliseconds, throughput is (y-x)/y
E.g. if GC consumes 50 ms every second, throughput is 95% (GC overhead is 5%)
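The throughput formula above can be checked with a few lines of Java. This is only a sketch of the arithmetic on this slide; the class and method names are made up for illustration.

```java
/** GC throughput as defined on this slide: the fraction of time not spent in GC. */
final class GcThroughput {
    /**
     * @param gcMillis     average time spent in GC per period (x)
     * @param periodMillis length of the period (y)
     * @return throughput (y - x) / y, a value between 0 and 1
     */
    static double throughput(double gcMillis, double periodMillis) {
        if (periodMillis <= 0 || gcMillis < 0 || gcMillis > periodMillis) {
            throw new IllegalArgumentException("need periodMillis > 0 and 0 <= gcMillis <= periodMillis");
        }
        return (periodMillis - gcMillis) / periodMillis;
    }

    public static void main(String[] args) {
        // The slide's example: 50 ms of GC in every 1000 ms.
        System.out.println(throughput(50, 1000)); // prints 0.95
    }
}
```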
GC overhead = GC time / Total time
Goal: Responsiveness
Pause Time
Garbage collector stops the application
During that time the application is not responsive
What is the maximal delay your application can sustain?
Batch applications: seconds
Web applications: ½ second?
Swing UI: 100 ms max
Trading applications: milliseconds
Robotics: microseconds
Goals
Memory footprint
Throughput
Max pause time
These goals contradict each other!
Choose 2 of 3
Goal: Fast Allocation
Maintaining a free list of objects tends to lead to fragmentation
Fragmentation increases allocation time
A well-known problem of C/C++ programs
Goal: Locality
TLAB - thread-local allocation buffer
Maintaining locality makes good use of the CPU cache
Linear allocation mechanism
Per-thread buffer, allocated on the first object allocation after a new generation GC
Small objects are allocated linearly in that buffer
Fast: about 10 machine instructions
Resized automatically (ResizeTLAB=true)
Based on TLABWasteTargetPercent=1
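Linear allocation inside a per-thread buffer amounts to "bumping a pointer". The toy model below illustrates the idea only; it is not how HotSpot implements TLABs, and the class name and byte-array buffer are assumptions made for the sketch.

```java
/** Toy model of linear ("bump the pointer") allocation inside a per-thread buffer. */
final class BumpAllocator {
    private final byte[] buffer; // stands in for one thread's TLAB
    private int top = 0;         // offset of the next free byte

    BumpAllocator(int capacityBytes) {
        this.buffer = new byte[capacityBytes];
    }

    /**
     * Returns the start offset of the new object, or -1 if it does not fit.
     * A real JVM would then retire this buffer and request a new one.
     */
    int allocate(int sizeBytes) {
        if (sizeBytes <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if (sizeBytes > buffer.length - top) {
            return -1;
        }
        int start = top;
        top += sizeBytes; // the whole fast path: one compare and one add
        return start;
    }

    int used() {
        return top;
    }
}
```

Because each thread owns its buffer, the fast path needs no locking, and objects allocated together end up next to each other in memory.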
Terminology
Sizing Heap
Heap
The region where objects reside
|H| - the size of the heap
#H - the number of objects in the heap
GC Root
An object which is referenced from outside the heap or heap region.
Java Local
JNI
Native stack
System class
Busy monitor
Etc…
Sizing: Live Set
Live set: the objects that are reachable from garbage collection roots
#LS - the number of objects in the live set
|LS| - the size of the live set
Mutator
Application threads
The stuff that creates new objects and changes state
The stuff that makes the garbage collector suffer
Allocation rate - how fast new objects are allocated
Mutation rate - how fast the application changes references of live objects
Collector
Pauseless collector (concurrent collector) vs. stop-the-world (STW) collector
Serial collector (single-threaded) vs. parallel collector (multi-threaded)
Incremental vs. monolithic
Conservative vs. precise
GC Safepoint
A point in thread execution where it is safe for a GC to operate
References inside thread stacks are identifiable
If a thread is at a safepoint, GC may commence
The thread will not be able to leave the safepoint until the GC ends
Global safepoint: all threads enter a safepoint
Safepoints must be frequent
Building Blocks: Mark and Sweep
Mark: O(#LS)
Every object holds a bit, initially set to 0
Start with the GC roots
DFS traversal: the bit of every visited object is changed to 1
Sweep: O(|H|)
Objects whose bit is 0 are added to the free list
Downside: heap fragmentation
[Figure: heap before the sweep (A B C E F D) and after it (A B C D F)]
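The two phases can be sketched on a toy object graph. This is a minimal illustration, assuming a heap modelled as a list of nodes; the class and field names are invented for the example and the code does not mirror any JVM implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy mark-and-sweep over an explicit object graph. */
final class MarkSweepHeap {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked; // the per-object mark bit, initially 0

        Node(String name) {
            this.name = name;
        }
    }

    final List<Node> objects = new ArrayList<>();  // every object in the heap
    final List<Node> roots = new ArrayList<>();    // GC roots
    final List<Node> freeList = new ArrayList<>(); // reclaimed objects

    Node allocate(String name) {
        Node n = new Node(name);
        objects.add(n);
        return n;
    }

    /** Mark: depth-first traversal from the roots; touches only live objects. */
    void mark() {
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (n.marked) {
                continue;
            }
            n.marked = true;
            for (Node r : n.refs) {
                stack.push(r);
            }
        }
    }

    /** Sweep: walks the whole heap; unmarked objects go to the free list. */
    void sweep() {
        Iterator<Node> it = objects.iterator();
        while (it.hasNext()) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false; // reset the bit for the next cycle
            } else {
                freeList.add(n);
                it.remove();
            }
        }
    }
}
```

The mark loop visits only reachable nodes, while the sweep loop iterates over every node, which is where the O(#LS) and O(|H|) costs on the slide come from.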
Building Blocks: Copy Collector
Mark: O(#LS)
Same as mark and sweep
Copy: O(|LS|) = O(#LS)
All live objects are copied to an empty region (to-space)
No fragmentation
Downside:
Needs twice as much memory
Copying is very expensive
STW - mutators cannot run while objects are being copied
[Figure: live objects A B C D F copied from from-space to to-space]
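A copy collector can be sketched in the style of Cheney's algorithm, where the to-space doubles as the work queue. The code below is an illustration on a toy object graph, not the JVM's implementation; the `Obj` type and the forwarding map are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy copying collector: live objects are copied into a fresh to-space. */
final class CopyCollector {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();

        Obj(String name) {
            this.name = name;
        }
    }

    /**
     * Copies everything reachable from roots into a new to-space list and returns it.
     * The roots list is updated in place to point at the copies.
     */
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        Map<Obj, Obj> forwarding = new IdentityHashMap<>(); // old object -> its copy

        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace, forwarding));
        }
        // Scan pointer chases the allocation pointer until no uncopied objects remain.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.refs.size(); i++) {
                o.refs.set(i, copy(o.refs.get(i), toSpace, forwarding));
            }
        }
        return toSpace;
    }

    private static Obj copy(Obj old, List<Obj> toSpace, Map<Obj, Obj> forwarding) {
        Obj existing = forwarding.get(old);
        if (existing != null) {
            return existing; // already copied: reuse the forwarding entry
        }
        Obj fresh = new Obj(old.name);
        fresh.refs.addAll(old.refs); // still from-space references; fixed during the scan
        forwarding.put(old, fresh);
        toSpace.add(fresh);
        return fresh;
    }
}
```

Unreachable objects are never visited, so the work is proportional to the live set, and the survivors end up packed together without fragmentation.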
Building Blocks: Mark (Sweep) Compact
Mark: O(#LS)
Same as mark and sweep
Sliding compaction: O(|H|+|LS|)
Move objects (relocate)
Fix pointers (remap)
Compact to the beginning of the heap
Does not need twice the memory
Copying is more delicate and may be slower
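Sliding compaction can be illustrated on a tiny array-based heap. In this sketch each heap cell holds at most one reference (the index of another cell, or -1), and the `live` array plays the role of the mark bits; both are simplifying assumptions made for the example.

```java
/** Toy sliding compaction: relocate live cells to the start of the heap and remap references. */
final class SlidingCompactor {
    /**
     * @param heap heap[i] is the index cell i refers to, or -1 for no reference
     * @param live live[i] is the mark bit of cell i, as produced by a mark phase
     * @return the number of live cells, which afterwards occupy heap[0..n)
     */
    static int compact(int[] heap, boolean[] live) {
        if (heap.length != live.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        int[] forward = new int[heap.length];
        int next = 0;
        // Pass 1: compute the new address of every live cell.
        for (int i = 0; i < heap.length; i++) {
            forward[i] = live[i] ? next++ : -1;
        }
        // Pass 2: remap references to the new addresses.
        for (int i = 0; i < heap.length; i++) {
            if (live[i] && heap[i] >= 0) {
                heap[i] = forward[heap[i]];
            }
        }
        // Pass 3: relocate. forward[i] <= i, so no unmoved live cell is overwritten.
        for (int i = 0; i < heap.length; i++) {
            if (live[i]) {
                heap[forward[i]] = heap[i];
            }
        }
        // Everything past the compacted prefix is now free.
        for (int i = 0; i < heap.length; i++) {
            live[i] = i < next;
            if (i >= next) {
                heap[i] = -1;
            }
        }
        return next;
    }
}
```

Every pass walks the whole heap, which matches the O(|H|+|LS|) cost on the slide, and no second semispace is needed.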
The Weak Generational Hypothesis
Most objects survive for only a short period of time.
Few references point from old objects to new objects
Generational Garbage Collectors
Since JDK 1.2 all collectors are generational and take advantage of the WGH
Different collectors can be chosen for each generation
New generation collector
Tenured generation collector
GC roots into the young generation are maintained by a “remembered set”
Oracle Hotspot Garbage Collectors
Serial (Serial, MSC)
Train Collector (history)
Parallel Collectors (a.k.a throughput collector)
Concurrent collector (CMS)
iCMS incremental CMS
G1GC (Experimental)
Major Memory Regions
[Diagram: Young, Tenured, Perm and Code Cache regions, grouped into heap and non-heap (native) areas]
Young generation
Further divided into:
Eden
A “from” survivor space
A “to” survivor space
Tenured (old) generation
Permanent generation / Metaspace
Code Cache
Object Life Cycle
Most objects are allocated in Eden space.
When Eden fills up a minor GC occurs
Reachable objects are copied to the “to” survivor space.
There are two survivor spaces; surviving objects are copied alternately from one to the other.
[Figure: Eden, S0 and S1 across three successive minor GCs]
Object Promotion
Objects are promoted to the old (tenured) generation when:
they have survived several minor GCs
the survivor spaces fill up
Serial Collectors
Single threaded
Stop the world
Monolithic
Efficient (no communication between threads)
The default on certain platforms

Garbage Collectors:Serial
Serial New Collector
-XX:+UseSerialGC
Serial New: single-threaded young generation collector
Triggered when:
Eden space is full.
System.gc() is called explicitly.
MSC (Serial Old)
Single-threaded tenured generation collector
Mark-sweep-compact
Events which initiate a Serial Old collection:
The tenured generation is unable to satisfy an object promotion coming from the young generation.
System.gc() is called explicitly.
Serial Collector: Suitability
Well suited for machines with a single processor core
CPU affinity (one-to-one JVM to processor core configuration)
Tends to work well for applications with small Java heaps, i.e. less than 128 MB to 256 MB
Train Collector
Introduced in Java 1.3
Divides the heap into small chunks
Incremental
Experimental
Discontinued on Java 1.4
Parallel Collector
Multi-threaded
Monolithic
Stop the world
Three variants
-XX:+UseParallelGC
-XX:+UseParallelOldGC
The default on most platforms
Managing collector threads
The number of parallel collector threads is controlled by -XX:ParallelGCThreads=<N>
Defaults to Runtime.availableProcessors().
Since a JDK 6 update release: 5/8ths of the available processors if there are more than 8
With multiple JVMs per machine, set -XX:ParallelGCThreads=<N> such that the sum of all threads < NCPU
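The sizing rules above can be sketched in Java. The formula below assumes the commonly described HotSpot heuristic (all processors up to 8, then 5/8 of each processor beyond 8); treat it as an assumption rather than a guarantee for any particular JVM version. The class and method names, and the even split between JVMs, are made up for the example.

```java
/** Sketch of how many parallel GC threads to expect or configure. */
final class GcThreadDefaults {
    /** Assumed default: ncpus if ncpus <= 8, otherwise 8 + 5/8 of the processors beyond 8. */
    static int defaultParallelGcThreads(int ncpus) {
        if (ncpus < 1) {
            throw new IllegalArgumentException("ncpus must be at least 1");
        }
        return ncpus <= 8 ? ncpus : 8 + (ncpus - 8) * 5 / 8;
    }

    /**
     * Hypothetical helper for several JVMs sharing one machine: an even split
     * chosen so that the sum over all JVMs stays below ncpus where possible.
     */
    static int perJvmThreads(int ncpus, int jvmCount) {
        if (ncpus < 1 || jvmCount < 1) {
            throw new IllegalArgumentException("arguments must be at least 1");
        }
        return Math.max(1, (ncpus - 1) / jvmCount);
    }

    public static void main(String[] args) {
        int ncpus = Runtime.getRuntime().availableProcessors();
        System.out.println("assumed default: -XX:ParallelGCThreads=" + defaultParallelGcThreads(ncpus));
    }
}
```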
Parallel GC Triggering
Same as the serial collector
Events which initiate a minor garbage collection:
Eden space is unable to satisfy an object allocation request.
Events which might initiate a full garbage collection:
The tenured generation is unable to satisfy an object promotion coming from the young generation.
System.gc() is called explicitly.
Parallel Collector:Suitability
Reduces garbage collection overhead on multi-core processor systems
Reduces pause time on multi-core systems
Best throughput
Pause time may be reasonable when heap size < 1 GB
CMS Collector
(Mostly) concurrent mark-and-sweep collector for the tenured space
Runs mostly concurrently with application threads
Does not compact the heap!
Enabled with -XX:+UseConcMarkSweepGC
ParNew - a parallel, multi-threaded young generation collector, enabled by default when working with CMS
Alas!
Lower throughput
Requires 20-30% more memory
A concurrent mode failure falls back to a stop-the-world full GC. It can occur when:
objects are copied to the tenured space faster than the concurrent collector can collect them (“loses the race”)
the tenured space is fragmented
-XX:PrintFLSStatistics=1
Concurrent Collector Phases
Concurrent collector cycle contains the following
phases:
Initial mark (*)
Concurrent mark
Concurrent Pre-clean
Remark (*) - second pass
Concurrent sweep
Concurrent reset
The Concurrent Collector
Initial mark phase (*)
Objects in the tenured generation are “marked” as reachable, including those objects which may be reachable from the young generation.
Pause time is typically short relative to minor collection pause times.
Concurrent mark phase
Traverses the tenured generation object graph for reachable objects concurrently while Java application threads are executing.
The Concurrent Collector Phases
Remark (*)
Finds objects that were missed by the concurrent mark phase due to updates by Java application threads to objects after the concurrent collector had finished tracing them.
Concurrent sweep
Collects the objects identified as unreachable during the marking phases.
Concurrent reset
Prepares for the next concurrent collection.
PermGen Collection
Classes will not be collected during CMS concurrent phases
Only during a full (STW) collection
Unless explicitly instructed to do so using -XX:+CMSClassUnloadingEnabled and -XX:+CMSPermGenSweepingEnabled
(the second switch is not needed in JVMs after HotSpot 6.0u4).
Suitability
Application responsiveness is more important than
application throughput
More than one core
Large Heaps > 1GB
ExplicitGC and CMS
An explicit GC (System.gc()) invokes the stop-the-world full GC
This can cause a problem with large heaps
-XX:+ExplicitGCInvokesConcurrent (Java 6)
-XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses (requires Java 6u4 or later).
CMS Minor Collection Triggering
Minor collections are triggered as with the parallel collector
The ParNew collector is built in such a way that it can work alongside CMS
CMS Collection Triggering
Starts when the occupancy exceeds a percentage threshold
Default value is 92%
-XX:CMSInitiatingOccupancyFraction=n where n is the % of the tenured space size.
iCMS
Deprecated in Java 8
CMSIncrementalMode enables the concurrent phases to be done incrementally.
Periodically gives the processor back to the application, resulting in better application responsiveness by doing the concurrent work in small chunks.
Tune iCMS
CMSIncrementalMode has a duty cycle that controls the amount of work the concurrent collector is allowed to do before giving up the processor.
The duty cycle is the % of time between minor collections that the concurrent collector is allowed to run.
By default the duty cycle is computed automatically, using what is called automatic pacing.
Both duty cycle and pacing can be fine-tuned.
Enabling iCMS Java 6
On JDK 6, it is recommended to use the following two switches together:
-XX:+UseConcMarkSweepGC and
-XX:+CMSIncrementalMode
Or use:
-Xincgc
Enabling iCMS Java5
On JDK 5 use all of the following switches together:
-XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode
-XX:+CMSIncrementalPacing
-XX:CMSIncrementalDutyCycleMin=0
-XX:CMSIncrementalDutyCycle=10
These JDK 5 settings mirror the default settings decided upon for JDK 6.
JDK 5's -Xincgc != CMSIncrementalMode; it only enables CMS
iCMS Fine Tuning
If full collections are still occurring, then:
Increase the safety factor using -XX:CMSIncrementalSafetyFactor=n
The default value is 10. Increasing the safety factor adds conservatism when computing the duty cycle.
Increase the minimum duty cycle using -XX:CMSIncrementalDutyCycleMin=n
The default is 0 in JDK 6, 10 in JDK 5.
Disable automatic pacing and use a fixed duty cycle using -XX:-CMSIncrementalPacing and -XX:CMSIncrementalDutyCycle=n
The default duty cycle is 10 in JDK 6, 50 in JDK 5.
G1 Garbage collector
Garbage Collectors: G1
Garbage First GC
Stop the world
Incremental
Beta stage in Java 6 and Java 7
Supported since Java 7u4
Garbage First GC
Applies new-generation harvesting to the tenured generation
Aims for a soft real-time goal: consume no more than x ms of any y ms time slice
while maintaining high throughput for programs with large heaps and high allocation rates, running on large multi-processor machines.
Heap Layout
The heap is divided into equal-sized regions, each a contiguous range of virtual memory
Region size is 1-32 MB, chosen from the heap size with a target of about 2000 regions
Empty regions are kept in a linked list
The heap is divided into new generation regions and old generation regions
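The region sizing rule can be approximated in a few lines. This sketch assumes a target of 2048 regions and a power-of-two region size clamped to 1-32 MB; the real JVM heuristic may differ in detail (for example in which heap size it starts from), and the class and method names are invented for the example.

```java
/** Approximation of how a G1-style region size could be derived from the heap size. */
final class G1RegionSizing {
    static final long MB = 1024L * 1024L;

    /** Assumes about 2048 regions, rounded down to a power of two, clamped to [1 MB, 32 MB]. */
    static long regionSizeBytes(long heapBytes) {
        if (heapBytes <= 0) {
            throw new IllegalArgumentException("heap size must be positive");
        }
        long size = Math.max(heapBytes / 2048, MB);
        size = Long.highestOneBit(size); // round down to a power of two
        return Math.min(size, 32 * MB);
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("estimated region size: " + regionSizeBytes(heap) / MB + " MB");
    }
}
```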
Heap Layout
Each region is marked as one of:
Eden
Survivor Space
Old Generation
Empty
Humongous
[Figure: grid of heap regions labeled E (Eden), S (Survivor), O (Old) and H (Humongous)]
G1 New Gen GC
Live objects from the young generation are moved to:
Survivor space regions
Old gen regions
STW pause
Calculates the new sizes of Eden and the survivor space
G1 Concurrent Mark
Triggered when the occupancy of the entire heap reaches a certain threshold
Marks regions
Calculates liveness information for each region
Concurrent
Empty regions are reclaimed immediately
OldGen collection
Chooses regions with low liveness
Piggybacks some of them on the next young GC
Denoted GC Pause (mixed)
Humongous Objects
Objects larger than 1/2 of the heap region size
Allocated in dedicated (contiguous sequences of) heap regions
These regions contain only the humongous object
GC is not optimized for these objects
Command line Options
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:InitiatingHeapOccupancyPercent=45
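Valid Combinations
Young: Serial, ParNew, Parallel Scavenge
Tenured: Serial Old, CMS, Parallel Old
G1GC handles both generations
[Diagram: lines connecting each young collector to the tenured collectors it can be paired with]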
Summary: -XX flags for JDK6
Collector                                   -XX: Param
"Serial" + "Serial Old"                     UseSerialGC
"ParNew" + "Serial Old"                     UseParNewGC
"ParNew" + "CMS" + "Serial Old" *           UseConcMarkSweepGC
"Parallel Scavenge" + "Serial Old"          UseParallelGC
"Parallel Scavenge" + "Parallel Old"        UseParallelOldGC
G1 Garbage collector **                     UseG1GC
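To check which collectors a given flag actually selected, the running JVM can be asked through the standard java.lang.management API. This is a small sketch; the class name is made up, and the collector names it prints depend on the JVM vendor and version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors the current JVM is running with. */
final class ActiveCollectors {
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time are reported per collector.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running it with different -XX flags from the table is a quick way to see the young and old collector pairing in effect.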
Alternatives
Azul C4
A proprietary JVM
Commercial
Pause-less
Requires changes in the OS (Linux only)
Achieves both throughput and pause-time goals
May require more memory: 32 GB and above
Useful for large heaps (1 TB is common)
Garbage Collectors: C4
Shenandoah
JEP 189 for OpenJDK
Developed at Red Hat
An ultra-low-pause-time garbage collector
Regions, same as G1
Concurrent collection w/o memory barrier
Not a generational collector
Brooks forwarding pointer
Garbage Collectors: Shenandoah
Weird References
Object Life Cycle
Garbage Collectors: References
Created
Initialized
Strongly Reachable
Softly Reachable
Weakly Reachable
Finalized
Phantom Reachable
Finalizers
Finalizers create performance issues
Do not rely on them to release resources; for example, do not rely on a finalizer to close file descriptors.
Limit the use of finalizers to a safety net
Use other mechanisms for releasing resources.
Keep the work done in a finalizer as short as possible.
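One such "other mechanism" is explicit, deterministic cleanup with AutoCloseable and try-with-resources (Java 7 and later). The resource class below is hypothetical and only tracks whether it has been closed.

```java
/** Hypothetical resource that is released deterministically instead of in a finalizer. */
final class TrackedResource implements AutoCloseable {
    private boolean closed;

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        // A real resource would release file descriptors, sockets, etc. here.
        closed = true;
    }

    public static void main(String[] args) {
        try (TrackedResource resource = new TrackedResource()) {
            System.out.println("in use, closed = " + resource.isClosed());
        } // close() runs here, even if the body throws
    }
}
```

The release happens at a known point in the program and does not depend on when, or whether, the garbage collector runs.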
Finalizers and GC
Inside a finalizer you have a reference to your object, and technically you may resurrect it.
Objects which have a finalize method need two GC cycles in order to be collected
This can lead to OOM errors
A resurrected object may become reachable again, but its finalize method will not run again.
In order to prevent this problem use phantom references.
Finalizers and Thread Safety
Finalizers are executed on a special thread
According to the Java memory model, updates to local variables may not be visible to the finalization thread
This occurs when GC happens too soon
To ensure correct memory visibility, one needs to use a synchronized block to force coherency
Reference Objects drawbacks
Lots of reference objects give the garbage collector more work to do, since unreachable reference objects need to be discovered and queued during garbage collection.
Reference object processing can extend the time it takes to perform garbage collections, especially if there are consistently many unreachable reference objects to process.
GC Tuning
Tuning Memory: Tuning GC
Sizing Generations
VisualVM's Monitor tab
JConsole's Memory tab
VisualGC's heap space sizes
GCHisto
jstat's GC options
-verbose:gc heap space sizes
-XX:+PrintGCDetails heap space sizes
Logging GC activity
The most important parameters are:
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
-Xloggc:<logfile>
If you want less data:
-verbose:gc
Log File Rotation
-XX:+UseGCLogFileRotation
Control log file size
-XX:GCLogFileSize=1000
Number of log files
-XX:NumberOfGCLogFiles=10
Starting from Java 7
Verbose GC at runtime
The java.lang:type=Memory MBean
Set the Verbose attribute to true
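From inside the application, the same attribute can be set through the standard MemoryMXBean. This is a minimal sketch; the wrapper class name is made up, and where the verbose output ends up depends on the JVM.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Toggles verbose GC output of the running JVM through the Memory MBean. */
final class VerboseGcToggle {
    /** Sets the Verbose attribute and returns the value the JVM reports afterwards. */
    static boolean setVerbose(boolean on) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.setVerbose(on);
        return memory.isVerbose();
    }

    public static void main(String[] args) {
        System.out.println("verbose GC enabled: " + setVerbose(true));
    }
}
```

A remote JMX client such as JConsole can flip the same attribute without touching application code.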
The GC Log
Logs every GC that occurs
PrintGCDetails (more verbose)
[GC [DefNew: 960K->64K(960K), 0.0047410 secs] 3950K->3478K(5056K), 0.0047900 secs]
-verbose:gc (less verbose)
[GC 327680K->53714K(778240K), 0.2161340 secs]
Format: Before->After(Total), time
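The simple -verbose:gc line format shown above is easy to parse with a regular expression. The parser below is a sketch that handles only that one format (it deliberately rejects the more detailed PrintGCDetails lines); the class and field names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses lines such as "[GC 327680K->53714K(778240K), 0.2161340 secs]". */
final class GcLogLine {
    private static final Pattern SIMPLE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]");

    final boolean fullGc;
    final long beforeKb;
    final long afterKb;
    final long totalKb;
    final double seconds;

    private GcLogLine(boolean fullGc, long beforeKb, long afterKb, long totalKb, double seconds) {
        this.fullGc = fullGc;
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.totalKb = totalKb;
        this.seconds = seconds;
    }

    /** Returns the parsed line, or null if the line is not in the simple format. */
    static GcLogLine parse(String line) {
        Matcher m = SIMPLE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new GcLogLine(
                m.group(1).equals("Full GC"),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)),
                Double.parseDouble(m.group(5)));
    }

    /** Heap space reclaimed by this collection, in KB. */
    long freedKb() {
        return beforeKb - afterKb;
    }
}
```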
CMS Collector
Minor collections follow the serial collector format.
[Full GC [CMS: 5994K->5992K(49152K), 0.2584730 secs] 6834K->5992K(63936K), [CMS Perm: 10971K->10971K(18404K)], 0.2586030 secs]
[GC [1 CMS-initial-mark: 13991K(20288K)] 14103K(22400K), 0.0023781 secs]
[CMS-concurrent-preclean: 0.044/0.064 secs]
[GC [1 CMS-remark: 16090K(20288K)] 17242K(22400K), 0.0210460 secs]
Additional information
More information with the flags
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintTenuringDistribution
GC Log Quirks
CMS GC logs may get garbled
This is due to the concurrency of the different GC mechanisms
Analysis tools should be able to overcome this problem
It is not easy at all, and the logs may need to be fixed manually.
J9
-Xverbosegclog:<file path>
Structured xml
<gc-end id="92" type="scavenge" contextid="88" durationms="4.464" usertimems="4.000" systemtimems="0.000"
timestamp="2014-01-21T16:37:18.953">
<mem-info id="93" free="5472664" total="8388608" percent="65">
<mem type="nursery" free="655360" total="2097152" percent="31">
<mem type="allocate" free="655360" total="1179648" percent="55" />
<mem type="survivor" free="0" total="917504" percent="0" />
</mem>
<mem type="tenure" free="4817304" total="6291456" percent="76">
<mem type="soa" free="4502936" total="5977088" percent="75" />
<mem type="loa" free="314368" total="314368" percent="100" />
</mem>
<pending-finalizers system="2" default="0" reference="24" classloader="0" />
<remembered-set count="1770" />
</mem-info>
</gc-end>
Visualization Tools
JClarity: Censum
Commercial product
Gives recommendations
GCViewer
Open source, originally by tagtraum industries
A new fork exists on GitHub
Latest version 1.33
https://github.com/chewiebug/GCViewer
Supports G1GC Java 6/7
Not bulletproof
GCViewer
GCViewer View
Choose which info to view
GCViewer - Summary
GCViewer - Memory
GCViewer - Pause
VisualGC
Explicit GC
Do not use System.gc() unless there is a specific use case or need.
Disable explicit GC: -XX:+DisableExplicitGC
The default RMI distributed GC interval is once per minute (60000 ms).
Use:
-Dsun.rmi.dgc.client.gcInterval=3600000
-Dsun.rmi.dgc.server.gcInterval=3600000
When using JDK 6 and the concurrent collector, also use -XX:+ExplicitGCInvokesConcurrent
Interpretation
The Jigsaw effect
Garbage collection makes a graph of heap usage over time look like a jigsaw blade (a sawtooth pattern).
Minor collections show up as small teeth
Full collections show up as large teeth
Measuring Memory Usage
The best way is a heap dump
In GC logs, look at the low points after the full GC lines.
Memory Leak
Draw a line through the heap usage points right after each full GC
If it increases over time --> memory leak
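"Drawing the line" can be done numerically with a least-squares fit over the heap usage measured right after each full GC. The helper below is a sketch; the class name is made up, and deciding how steep a slope counts as a leak is left to the reader.

```java
/** Least-squares slope of heap usage after full GCs, as a rough leak indicator. */
final class LeakTrend {
    /**
     * @param timeSec timestamps of the full GCs, in seconds
     * @param usedMb  heap usage right after each full GC, in MB
     * @return growth in MB per second; clearly positive values suggest a leak
     */
    static double slope(double[] timeSec, double[] usedMb) {
        if (timeSec.length != usedMb.length || timeSec.length < 2) {
            throw new IllegalArgumentException("need at least two matching samples");
        }
        double meanT = 0;
        double meanU = 0;
        for (int i = 0; i < timeSec.length; i++) {
            meanT += timeSec[i];
            meanU += usedMb[i];
        }
        meanT /= timeSec.length;
        meanU /= usedMb.length;

        double num = 0;
        double den = 0;
        for (int i = 0; i < timeSec.length; i++) {
            num += (timeSec[i] - meanT) * (usedMb[i] - meanU);
            den += (timeSec[i] - meanT) * (timeSec[i] - meanT);
        }
        if (den == 0) {
            throw new IllegalArgumentException("timestamps must not all be equal");
        }
        return num / den;
    }
}
```

For example, samples of 100, 110, 120 and 130 MB taken 60 seconds apart give a slope of about 0.17 MB per second, while a flat series gives 0.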
Low Throughput
Under 95%
Increase the heap:
Low throughput is usually a result of insufficient memory
GC kicks in too frequently and frees only small amounts
New gen too small
New gen GC kicks in too often
It is not able to release enough
As a result, mid-term objects spill into the old gen.
Old gen too small
Application state spills into the new gen.
Breaks the generational hypothesis
Contradicting Goals
Goal 1: Retain as many objects as possible in the survivor spaces
Less promotion into the old generation
Less frequent old GCs
Goal 2: Do not copy very long-lived objects between the survivors
Unnecessary overhead on minor GCs
High Pause Time
Choose the correct GC scheme:
CMS/G1 when low pause times are required
Reduce the footprint of user-interactive processes
Reduce memory allocations: do not allocate too many related temporary objects
Break work into chunks.
Tuning Memory: Tuning GC

More Related Content

Let's talk about Garbage Collection

  • 1. Garbage Collectors Haim Yadid, Head of Performance and Application Infra Group
  • 3. Why Java has GC? Memory management is hard malloc/free Memory leaks (dangling pointers) Heap corruption (Free an object twice) Reference counting Cyclic graphs Garbage Collectors
  • 4. GC Me allocate myObject = new MyClass(2015) GC cleans So everything is cool !
  • 5. Goal: Minimize Memory Overhead Garbage collector memory overhead Internal structures Additional memory required for GC Policy and amount of memory generated before GC Garbage Collectors
  • 6. Goal: Application Throughput No garbage collection at all -->
 Application throughput 1 (100% of the time) If in average garbage collection consumes x milliseconds Every y milliseconds Throughput is (y-x)/y E.g. if GC consumes 50 ms every second 
 Throughput is 95% (GC overhead is 5%) Garbage Collectors GCtime   —————   Total  Time  
  • 7. Goal: Responsiveness Pause Time Garbage collector stops the application During that time the application is not responsive What is the maximal delay your application can sustain: Batch applications seconds… Web applications ½ second? Swing UI 100 ms max Trading application milliseconds Robotics microseconds…. Garbage Collectors
  • 8. Goals Memory footprint Throughput Max pause time Garbage Collectors: G1 Contradicting! Choose 2 of 3
  • 9. Goal: Fast Allocation Maintaining free list of objects tend to lead to fragmentation Fragmentation increases allocation time A well known problem of C/C++ programs Garbage Collectors
  • 10. Goal: Locality TLAB - thread local allocation buffer Maintaining locality utilizes CPU cache Linear allocation mechanism Per thread buffer allocated on first object alloc after new Gen GC. Small objects allocated linearly on that buffer FAST 10 machine instructions Resized automatically ( ResizeTLAB=true) Based on TLABWasteTargetPercent=1 Garbage Collectors
  • 12. Sizing Heap Heap the region of objects reside |H| - The size of the heap #H - number of object in the heap Garbage Collectors
  • 13. GC Root An object which is references from outside the heap or heap region. Java Local JNI Native stack System class Busy monitor Etc… Garbage Collectors
  • 14. Sizing : Live Set Live Set : Object that are reachable from garbage collection roots #LS - number of object in the live set |LS| - The size of live set Garbage Collectors
  • 15. Mutator Application threads The stuff that create new objects and changes state The stuff that makes the garbage collector suffer Allocation rate - how fast new objects are allocated Mutation rate - how fast app changes references of live objects Garbage Collectors
  • 16. Collector Pauseless Collector (Concurrent Collector )
 vs Stop the world (STW) Collector Serial Collector -Single threaded
 vs Parallel collector -Multi threaded Incremental vs Monolithic Conservative vs Precise Garbage Collectors
  • 17. GC Safepoint A point in thread execution where it is safe for a GC to operate References inside thread stacks are identifiable If a thread is in a safe point GC may commence Thread will not be able to exit a safe-point until GC ends Global Safe-point : All threads enter a safe point Safe points must be frequent Garbage Collectors
  • 18. Building Blocks: Mark and Sweep Mark: O(#LS) Every object holds a bit initially set to 0 Start with GC roots DFS traversal every visited object change to 1 Sweep O(|H|) Objects with 0 are added to the empty list Downside: Heap fragmentation Garbage Collectors A B C E F D A B C D F
  • 19. Building Blocks: Copy Collector Mark: O(#LS) Same as mark and sweep Copy: O(|LS|) = O(#LS) All live objects are copied to an empty region(to space) No fragmentation Downside: Need twice as much memory Copy is very expensive STW - mutators cannot work at the same time as copy Garbage Collectors A B C D F A B C D F from space to space
  • 20. Building Blocks: Mark (Sweep)Compact Mark: O(#LS) Same as mark&sweep Sliding compaction O(|H|+|LS|) move objects (relocate) fix pointers(remap) Compact to the beginning of the heap Do not need twice the memory Copy is more delicate and may be slower Garbage Collectors
  • 21. The Weak Generational Hypothesis Most objects survive for only a short period of time. Low number of references from old to new objects Garbage Collectors
  • 22. Generational Garbage Collectors Since JDK 1.2 all collectors are generational and advantage of the WGH Different Collectors can be chosen for each generation New generation collector Tenured generation collector GC roots to young gen are maintained by a ��remembered set” Garbage Collectors
  • 23. Oracle Hotspot Garbage Collectors Serial (Serial, MSC) Train Collector (history) Parallel Collectors (a.k.a throughput collector) Concurrent collector (CMS) iCMS incremental CMS G1GC (Experimental) Garbage Collectors
  • 24. Major Memory Regions Monitoring the JVM Young PermTenured Code Cache Young generation Further divided into: Eden A “from” survivor space A “to” survivor space Tenured (old) generation Permanent generation/Meta Space Code Cache Heap Non Heap NativeNative
  • 25. Object Life Cycle Most objects are allocated in Eden space. When Eden fills up a minor GC occurs Reachable object are copied “to” survivor space. There are two survivor spaces surviving objects are copied alternately 
 from one to the other. Eden S0 S1 Eden S0 S1 Eden S0 S1 1 2 3
  • 26. Object Promotion Objects are promoted to the old generation (tenured) when: surviving several minor GCs Survival spaces fill up
  • 27. Serial Collectors Single threaded Stop the world Monolithic Efficient (no communication between threads) The default on certain platforms
 Garbage Collectors:Serial
  • 28. Serial New Collector -XX:+ UseSerialGC Serial new Single threaded young generation collector Triggered when: Eden space is full. An explicit invocation or call to System.gc(). Garbage Collectors: Serial
  • 29. MSC (Serial Old) Single threaded tenured generation collector Mark & Sweep Compact Events which initiate a serial collector garbage collection Tenured generation space is unable to satisfy an 
 object promotion coming from young generation. An explicit invocation or call to System.gc(). Garbage Collectors: Serial
  • 30. Serial Collector: Suitability Well suited for single processor core machines CPU affinity (one-to-one JVM to processor core configuration) Tends to work well for applications with small Java heaps, i.e. less than 128mb- 256mb Garbage Collectors:Serial
  • 31. Train Collector Introduced in Java 1.3 Divides the heap into small chunks Incremental Experimental Discontinued on Java 1.4 Garbage Collectors
  • 32. Parallel Collector Multi-threaded Monolithic Stop the world Three variants -XX:+UseParallelGC -XX:+UseParallelOldGC The default on most platforms Garbage Collectors
  • 33. Managing collector threads Number parallel collector threads controlled 
 by -XX:ParallelGCThreads=<N> Defaults to Runtime.availableProcessors(). 
 In a JDK 6 update release, 5/8ths available processors if > 8 Multiple JVM per machine configurations, 
 set -XX: ParallelGCThreads=<N> such that sum of all threads < NCPU Garbage Collectors
  • 34. Parallel GC Triggering Same as serial collector Events which initiate a minor garbage collection Eden space is unable to satisfy an object allocation request. Results in a minor garbage collection event. Events which might initiate a full garbage collection Tenured generation space unable to satisfy an 
 object promotion coming from young generation. An explicit invocation or call to System.gc(). Garbage Collectors
  • 35. Parallel Collector:Suitability Reduce garbage collection overhead 
 on multi-core processor systems Reduce pause time on multi-core systems Best throughput Pause time may be reasonable when heap size < 1GB Garbage Collectors
  • 36. CMS Collector (Mostly) Concurrent mark and sweep tenured space collector Runs mostly concurrently with application threads Do not compact heap! Enabled with -XX:+UseConcMarkSweepGC ParNew - Parallel, multi-threaded young generation collector enabled by default. working with CMS Garbage Collectors
  • 37. Alas! Lower throughput Requires more memory 20-30% more Concurrent mode failure will fallback to a stop the world full GC can occur when : objects are copied to the tenured space faster than the concurrent collector can collect them. (“loses the race”) space fragmentation -XX:PrintFLSStatistics=1 Garbage Collectors
  • 38. Concurrent Collector Phases Concurrent collector cycle contains the following phases: Initial mark (*) Concurrent mark Concurrent Pre-clean Remark (*) - second pass Concurrent sweep Concurrent reset Garbage Collectors
  • 39. The Concurrent Collector Initial mark phase(*) Objects in the tenured generation are “marked” 
 as reachable including those objects which may 
 be reachable from young generation. Pause time is typically short in duration relative 
 to minor collection pause times. Concurrent mark phase Traverses the tenured generation object graph 
 for reachable objects concurrently while Java 
 application threads are executing. Garbage Collectors
  • 40. The Concurrent Collector Phases Remark(*) Finds objects that were missed by the concurrent 
 mark phase due to updates by Java application 
 threads to objects after the concurrent collector 
 had finished tracing that object. Concurrent sweep Collects the objects identified as 
 unreachable during marking phases. Concurrent reset Prepares for next concurrent collection. Garbage Collectors
  • 41. PermGen Collection Classes will not be collected during CMS 
 concurrent phases Only during Full (STW) collection Explicitly instructed to do so using 
 -XX:+CMSClassUnloadingEnabled and 
 -XX:+PermGenSweepingEnabled, 
 (the 2nd switch is not needed in post HotSpot 6.0u4 JVMs). Garbage Collectors
  • 42. Suitability Application responsiveness is more important than application throughput More than one core Large Heaps > 1GB Garbage Collectors
  • 43. ExplicitGC and CMS Explicit GC used to invoke the stop the world GC This can cause a problem with large heaps
 -XX:+ExplicitGCInvokesConcurrent (Java 6) -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses 
 (requires Java6u4 or later). Garbage Collectors
  • 44. CMS Minor Collection Triggering Minor collections are triggered as with the parallel collector The ParNew collector is built in such a way it can work in parallel with the CMS Garbage Collectors
  • 45. CMS Collection Triggering. A concurrent collection starts when the tenured occupancy exceeds a percentage threshold. The default value is 92%.
 Set it with -XX:CMSInitiatingOccupancyFraction=n, where n is the percentage of the tenured space size.
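The threshold test on this slide can be sketched in a few lines of Java. This is an illustration only, not the collector's internal logic: the class and method names are hypothetical, the pool-name matching ("Old Gen" or "Tenured") is an assumption about how the running JVM names its tenured pool, and occupancy is measured here against the committed size of the pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class OccupancyCheck {
    /** Occupancy of a space as a percentage of its capacity. */
    static double occupancyPercent(long usedBytes, long capacityBytes) {
        return 100.0 * usedBytes / capacityBytes;
    }

    /** True when occupancy is above the initiating threshold n (in percent). */
    static boolean wouldTrigger(long usedBytes, long capacityBytes, int n) {
        return occupancyPercent(usedBytes, capacityBytes) > n;
    }

    public static void main(String[] args) {
        int threshold = 92; // the default quoted on the slide
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: the tenured pool is named "... Old Gen" or "Tenured Gen".
            if (!(name.contains("Old Gen") || name.contains("Tenured"))) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            long capacity = usage.getCommitted();
            if (capacity <= 0) {
                continue; // nothing committed yet, occupancy is undefined
            }
            System.out.printf("%s: %.1f%% occupied, above %d%%: %b%n",
                    name,
                    occupancyPercent(usage.getUsed(), capacity),
                    threshold,
                    wouldTrigger(usage.getUsed(), capacity, threshold));
        }
    }
}
```

Running it inside an application gives a rough view of how close the tenured space is to the value chosen for n.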
  • 46. iCMS. Deprecated in Java 8. CMSIncrementalMode lets the concurrent
 phases run incrementally. It periodically gives the processor back to
 the application, which improves application responsiveness by doing the concurrent work
 in small chunks.
  • 47. Tune iCMS. CMSIncrementalMode has a duty cycle that controls the amount of work the concurrent collector is allowed to do before giving up the processor. The duty cycle is the percentage of time between minor collections
 that the concurrent collector is allowed to run. By default the duty cycle is computed automatically, which is called automatic pacing. Both duty cycle and pacing can be fine-tuned.
  • 48. Enabling iCMS Java 6. On JDK 6, the recommendation is to use the following two switches together: -XX:+UseConcMarkSweepGC and -XX:+CMSIncrementalMode. Alternatively, use -Xincgc.
  • 49. Enabling iCMS Java 5. On JDK 5, use all of the following switches together: -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10. These JDK 5 settings mirror the defaults chosen for JDK 6. On JDK 5, -Xincgc is not the same as CMSIncrementalMode; it only enables CMS.
  • 50. iCMS Fine Tuning. If full collections are still occurring, then:
 Increase the safety factor using -XX:CMSIncrementalSafetyFactor=n.
 The default value is 10. Increasing the safety factor adds conservatism when computing the duty cycle.
 Increase the minimum duty cycle using -XX:CMSIncrementalDutyCycleMin=n.
 The default is 0 in JDK 6 and 10 in JDK 5.
 Disable automatic pacing and use a fixed duty cycle using -XX:-CMSIncrementalPacing and -XX:CMSIncrementalDutyCycle=n.
 The default duty cycle is 10 in JDK 6 and 50 in JDK 5.
  • 52. Garbage First GC. A stop-the-world, incremental collector. In beta stage in Java 6 and Java 7; supported since Java 7u4.
  • 53. Garbage First GC. Applies new-generation-style harvesting to the tenured generation. Aims to achieve a soft real-time goal:
 consume no more than x ms of any y ms time slice, while maintaining high throughput for programs with large heaps and high allocation rates running on large multi-processor machines.
  • 54. Heap Layout. The heap is divided into equal-sized heap regions, each a contiguous range of virtual memory. Region size is 1-32 MB, chosen from the heap size with a target of about 2000 regions. Empty regions are kept in a linked list. The heap is divided into new generation regions and old generation regions.
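The region sizing rule above can be illustrated with a small calculation. The sketch below is an approximation under stated assumptions, not the JVM's actual ergonomics code: it assumes a target of 2048 regions (the slide rounds this to 2000), rounds down to a power of two, and clamps the result to the 1-32 MB range. The class and method names are hypothetical.

```java
class G1RegionSize {
    static final long MB = 1024L * 1024L;
    // Assumption: a power-of-two target close to the "2000 regions" on the slide.
    static final long TARGET_REGIONS = 2048;

    /** Approximate region size in bytes for a given heap size in bytes. */
    static long regionSizeBytes(long heapBytes) {
        long raw = heapBytes / TARGET_REGIONS;
        // Clamp to the 1 MB .. 32 MB range quoted on the slide.
        long clamped = Math.max(MB, Math.min(32 * MB, raw));
        // Round down to a power of two.
        return Long.highestOneBit(clamped);
    }

    /** Number of regions that fit in the heap at that region size. */
    static long regionCount(long heapBytes) {
        return heapBytes / regionSizeBytes(heapBytes);
    }

    public static void main(String[] args) {
        long[] heapsInMb = {512, 4096, 6144, 102400};
        for (long heapMb : heapsInMb) {
            long heap = heapMb * MB;
            System.out.println(heapMb + " MB heap: region size "
                    + regionSizeBytes(heap) / MB + " MB, regions " + regionCount(heap));
        }
    }
}
```

Note that for very large heaps the 32 MB cap means the region count grows past the target.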
  • 55. Heap Layout. Each region is marked as one of: Eden, Survivor space, Old generation, Empty, or Humongous. [Figure: grid of heap regions labeled E, S, O, and H]
  • 56. G1 New Gen GC. Live objects from the young generation are moved to survivor space regions or to old gen regions. This is an STW pause. The new sizes of eden and of the survivor space are calculated.
  • 57. G1 Concurrent Mark. Triggered when the occupancy of the entire heap reaches a certain threshold. Marks regions and calculates liveness information for each region. Runs concurrently. Empty regions are reclaimed immediately.
  • 58. OldGen collection. Choose regions with low liveness and piggyback the collection of some of them on the next young GC. Denoted in the log as GC Pause (mixed).
  • 59. Humongous Objects. Objects of at least 1/2 of the heap region size. They are allocated in dedicated (contiguous sequences of) heap regions; these regions contain only the humongous object. GC is not optimized for these objects.
  • 61. Valid Combinations. [Figure: valid pairings of young generation collectors (Serial, ParNew, Parallel Scavenge) with tenured generation collectors (Serial Old, CMS, Parallel Old), plus G1GC]
  • 62. Summary: -XX flags for JDK6
 Collector                                   -XX: Param
 "Serial" + "Serial Old"                     UseSerialGC
 "ParNew" + "Serial Old"                     UseParNewGC
 "ParNew" + "CMS" + "Serial Old" *           UseConcMarkSweepGC
 "Parallel Scavenge" + "Serial Old"          UseParallelGC
 "Parallel Scavenge" + "Parallel Old"        UseParallelOldGC
 G1 Garbage collector **                     UseG1GC
  • 64. Azul C4. A proprietary, commercial JVM. Pause-less. Requires changes in the OS (Linux only). Achieves both throughput and pause time goals. May require more memory: 32 GB and above. Useful for large heaps (1 TB is not unusual). Garbage Collectors: C4
  • 65. Shenandoah. JEP 189 for OpenJDK, developed at Red Hat: an Ultra-Low-Pause-Time Garbage Collector. Uses regions, same as G1. Concurrent collection without memory barrier. Not a generational collector. Uses a Brooks forwarding pointer. [Figure: forwarding pointer diagram with labels "Ref" and "Object"] Garbage Collectors: Shenandoah
  • 67. Object Life Cycle. [Figure: object states Created, Initialized, Strongly Reachable, Softly Reachable, Weakly Reachable, Finalized, Phantom Reachable] Garbage Collectors: References
  • 68. Finalizers. Finalizers create performance issues. For example:
 do not rely on a finalizer to close file descriptors. Limit the use of finalizers to a safety net. Use other mechanisms for releasing resources. Keep the work done in a finalizer as short as possible.
  • 69. Finalizers and GC. Inside a finalizer you have a reference to your object, and technically you may resurrect it. Objects which have a finalize method need two GC cycles in order to be collected, which can lead to OOM errors. A resurrected object may be reachable again, but its finalize method will not run again. To prevent this problem, use phantom references.
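A minimal sketch of the phantom reference alternative, using java.lang.ref.PhantomReference and ReferenceQueue. The class PhantomCleanup, the ResourceRef subclass, and the resource name are hypothetical and only illustrate the pattern: the reference carries the information needed for cleanup, and since get() on a phantom reference always returns null, the object cannot be resurrected.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PhantomCleanup {
    /** Phantom reference that remembers which resource to release. */
    static final class ResourceRef extends PhantomReference<Object> {
        final String resourceName;

        ResourceRef(Object owner, ReferenceQueue<Object> queue, String resourceName) {
            super(owner, queue);
            this.resourceName = resourceName;
        }
    }

    // The references themselves must stay strongly reachable,
    // otherwise they are collected and never enqueued.
    private static final Set<ResourceRef> TRACKED = ConcurrentHashMap.newKeySet();

    static ResourceRef track(Object owner, ReferenceQueue<Object> queue, String resourceName) {
        ResourceRef ref = new ResourceRef(owner, queue, resourceName);
        TRACKED.add(ref);
        return ref;
    }

    /** Releases every resource whose owner has been collected; returns how many. */
    static int drain(ReferenceQueue<Object> queue) {
        int released = 0;
        Reference<?> r;
        while ((r = queue.poll()) != null) {
            ResourceRef ref = (ResourceRef) r;
            TRACKED.remove(ref);
            System.out.println("releasing " + ref.resourceName);
            released++;
        }
        return released;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object owner = new Object();
        track(owner, queue, "temp-file");
        owner = null;       // drop the only strong reference
        System.gc();        // only a hint; the JVM may ignore it
        Thread.sleep(100);  // give the reference handler time to enqueue
        drain(queue);       // may release nothing if no GC actually ran
    }
}
```

In a real application drain would typically run on a dedicated thread or at convenient points such as resource allocation.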
  • 70. Finalizers and Thread Safety. Finalizers are executed from a special thread. According to the Java memory model,
 updates to variables may not be visible to the finalization thread. This occurs when GC happens too soon. To ensure correct memory visibility, one needs to use a synchronized block to force coherency.
  • 71. Reference Objects drawbacks. Lots of reference objects give the garbage collector more work to do, since unreachable reference objects need to be discovered and
 queued during garbage collection. Reference object processing can extend the time
 it takes to perform garbage collections, especially
 if there are consistently many unreachable reference objects to process.
  • 73. Sizing Generations. Sources of heap space sizes: VisualVM's Monitor tab; JConsole's Memory tab; VisualGC's heap space sizes; GCHisto; jstat's GC options; -verbose:gc heap space sizes; -XX:+PrintGCDetails heap space sizes. Tuning Memory: Tuning GC
  • 74. Logging GC activity. The most important parameters are: -XX:+PrintGCDetails, -XX:+PrintGCTimeStamps, -XX:+PrintGCDateStamps, and -Xloggc:<logfile>. If you want less data, use -verbose:gc.
  • 75. Log File Rotation. Enable with -XX:+UseGCLogFileRotation. Control the log file size with -XX:GCLogFileSize=1000 and the number of log files with -XX:NumberOfGCLogFiles=10. Available starting from Java 7.
  • 76. Verbose GC at runtime. On the java.lang:type=Memory MBean, set the Verbose attribute to true.
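The same attribute can be set from inside the process through the platform MemoryMXBean, which is the Java-side view of the java.lang:type=Memory MBean. A minimal sketch follows; the class name VerboseGcToggle is hypothetical, while setVerbose and isVerbose are the standard java.lang.management.MemoryMXBean methods.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

class VerboseGcToggle {
    /** Switches verbose GC output on or off and returns the resulting state. */
    static boolean setVerbose(boolean on) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.setVerbose(on);
        return memory.isVerbose();
    }

    public static void main(String[] args) {
        System.out.println("verbose GC enabled: " + setVerbose(true));
        System.gc(); // a hint; if a collection runs, it is now logged
        System.out.println("verbose GC enabled: " + setVerbose(false));
    }
}
```

From outside the process, the same attribute can be changed with a JMX client such as JConsole.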
  • 77. The GC Log. Every GC occurrence is logged.
 PrintGCDetails (more verbose): [GC [DefNew: 960K->64K(960K), 0.0047410 secs] 3950K->3478K(5056K), 0.0047900 secs]
 verbose:gc (less verbose): [GC 327680K->53714K(778240K), 0.2161340 secs]
 Format: Before->After(Total), time
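The Before->After(Total), time format lends itself to a simple regular expression. The sketch below is a hypothetical helper, not part of any tool mentioned in these slides: it extracts the whole-heap figures from the end of a line, so it works for both sample formats above and returns null for lines that carry no Before->After pair.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Matches the trailing "Before->After(Total), time secs]" of a log line.
    private static final Pattern HEAP_SUMMARY =
            Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\), (\\d+\\.\\d+) secs\\]\\s*$");

    final long beforeKb;
    final long afterKb;
    final long totalKb;
    final double pauseSecs;

    GcLogLine(long beforeKb, long afterKb, long totalKb, double pauseSecs) {
        this.beforeKb = beforeKb;
        this.afterKb = afterKb;
        this.totalKb = totalKb;
        this.pauseSecs = pauseSecs;
    }

    /** Returns the whole-heap summary of one log line, or null if it does not match. */
    static GcLogLine parse(String line) {
        Matcher m = HEAP_SUMMARY.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new GcLogLine(
                Long.parseLong(m.group(1)),
                Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)),
                Double.parseDouble(m.group(4)));
    }

    public static void main(String[] args) {
        GcLogLine g = parse("[GC 327680K->53714K(778240K), 0.2161340 secs]");
        System.out.println("freed " + (g.beforeKb - g.afterKb) + "K in " + g.pauseSecs + " s");
    }
}
```

Real logs contain many more line shapes (date stamps, concurrent phases, garbled CMS output), so a production parser needs considerably more care.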
  • 78. CMS Collector. Minor collections follow the serial collector format.
 [Full GC [CMS: 5994K->5992K(49152K), 0.2584730 secs] 6834K->5992K(63936K), [CMS Perm: 10971K->10971K(18404K)], 0.2586030 secs]
 [GC [1 CMS-initial-mark: 13991K(20288K)] 14103K(22400K), 0.0023781 secs]
 [CMS-concurrent-preclean: 0.044/0.064 secs]
 [GC [1 CMS-remark: 16090K(20288K)] 17242K(22400K), 0.0210460 secs]
  • 79. Additional information. More information is available with the flags -XX:+PrintGCApplicationStoppedTime, -XX:+PrintGCApplicationConcurrentTime, and -XX:+PrintTenuringDistribution.
  • 80. GC Log Quirks. CMS GC logs may get garbled because the different GC mechanisms write concurrently. Analysis tools should be able to overcome this problem, but it is not easy at all, and the log may need to be fixed manually.
  • 81. J9. Enable with -Xverbosegclog:<file path>. The output is structured XML:
 <gc-end id="92" type="scavenge" contextid="88" durationms="4.464" usertimems="4.000" systemtimems="0.000" timestamp="2014-01-21T16:37:18.953">
   <mem-info id="93" free="5472664" total="8388608" percent="65">
     <mem type="nursery" free="655360" total="2097152" percent="31">
       <mem type="allocate" free="655360" total="1179648" percent="55" />
       <mem type="survivor" free="0" total="917504" percent="0" />
     </mem>
     <mem type="tenure" free="4817304" total="6291456" percent="76">
       <mem type="soa" free="4502936" total="5977088" percent="75" />
       <mem type="loa" free="314368" total="314368" percent="100" />
     </mem>
     <pending-finalizers system="2" default="0" reference="24" classloader="0" />
     <remembered-set count="1770" />
   </mem-info>
 </gc-end>
  • 83. JClarity: Censum. A commercial product. Gives recommendations.
  • 84. GCViewer. Open source, originally by tagtraum industries. A newer fork exists on GitHub (latest version 1.33): https://github.com/chewiebug/GCViewer. Supports G1GC on Java 6/7. Not bulletproof.
  • 86. GCViewer View. Choose which info to view.
  • 88. GCViewer - Memory
  • 91. Explicit GC. Do not use System.gc() unless there is a specific use case or need.
 Disable explicit GC with -XX:+DisableExplicitGC.
 The default RMI distributed GC interval is once per minute (60000 ms). To lengthen it, use
 -Dsun.rmi.dgc.client.gcInterval=3600000 and -Dsun.rmi.dgc.server.gcInterval=3600000.
 When using JDK 6 and the concurrent collector, also use -XX:+ExplicitGCInvokesConcurrent.
  • 93. The Jigsaw effect. Garbage collection makes diagrams of heap usage over time look like a jigsaw. Minor collections show up as small teeth; full collections show up as large teeth.
  • 94. Measuring Memory Usage. The best way is a heap dump. From GC logs, look at the lowest points of the Full GC lines.
  • 95. Memory Leak. Draw a line between the full GC points. If it increases over time, there is a memory leak.
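Drawing the line between the full GC points can be approximated numerically by fitting a least-squares line through the heap occupancy measured after each full GC. The sketch below assumes the samples have already been extracted from the log; the class name and the sample values are hypothetical, and a positive slope is only a hint of a leak, not proof.

```java
class LeakTrend {
    /**
     * Least-squares slope of heap occupancy after full GC versus time.
     * Units are whatever the inputs use, for example KB per second.
     */
    static double slope(double[] timeSecs, double[] heapAfterFullGc) {
        if (timeSecs.length != heapAfterFullGc.length || timeSecs.length < 2) {
            throw new IllegalArgumentException("need at least two matching samples");
        }
        int n = timeSecs.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            sumX += timeSecs[i];
            sumY += heapAfterFullGc[i];
            sumXY += timeSecs[i] * heapAfterFullGc[i];
            sumXX += timeSecs[i] * timeSecs[i];
        }
        double denominator = n * sumXX - sumX * sumX;
        if (denominator == 0) {
            throw new IllegalArgumentException("all samples share the same timestamp");
        }
        return (n * sumXY - sumX * sumY) / denominator;
    }

    public static void main(String[] args) {
        // Hypothetical samples: seconds since start, heap KB after each full GC.
        double[] time = {60, 90, 120, 150, 180};
        double[] heap = {100_000, 110_000, 120_000, 130_000, 140_000};
        double s = slope(time, heap);
        System.out.println("trend: " + s + " KB/s" + (s > 0 ? " (growing)" : " (stable)"));
    }
}
```

A longer observation window makes the trend more trustworthy, since caches and lazily initialized state also grow the live set early in a run.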
  • 96. Low Throughput. Throughput is considered low when it is under 95%. Increase the heap: low throughput is usually a result of insufficient memory, where GC kicks in too frequently and frees only small amounts. New gen too small: new gen GC kicks in too often and is not able to release enough; as a result, mid-term objects spill into the old gen. Old gen too small: application state spills into the new gen, which breaks the generational hypothesis.
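To check a running JVM against the 95% mark, the throughput formula from the beginning of the deck, (y-x)/y, can be applied to the accumulated GC time reported by the GarbageCollectorMXBeans. This is a rough sketch with a hypothetical class name; the reported collection time is approximate and, for concurrent collectors, may include time that did not actually stop the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcThroughput {
    /** Throughput in percent: the share of total time not spent in GC. */
    static double throughputPercent(long gcMillis, long totalMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return 100.0 * (totalMillis - gcMillis) / totalMillis;
    }

    /** Sum of the approximate accumulated collection time of all collectors. */
    static long totalGcMillis() {
        long sum = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                sum += t;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        double throughput = throughputPercent(totalGcMillis(), uptime);
        System.out.printf("GC throughput: %.2f%%%s%n",
                throughput, throughput < 95.0 ? " (below the 95% mark)" : "");
    }
}
```

With 50 ms of GC in every second, throughputPercent(50, 1000) gives the 95% figure used earlier in the deck.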
  • 97. Contradicting Goals. Goal 1: retain as many objects as possible in the survivor spaces. This means less promotion into the old generation and less frequent old GCs. Goal 2: do not copy very long-lived objects between the survivors, which is unnecessary overhead on minor GCs.
  • 98. High Pause Time. Choose the correct GC scheme: CMS/G1 when low pause is required. Reduce the footprint of user-interactive processes. Reduce memory allocations: do not allocate too many related temporary objects, and chunk the work.