Exploring .NET memory
management
A trip down memory lane
Maarten Balliauw
@maartenballiauw
Who am I?
Maarten Balliauw
Antwerp, Belgium
Developer Advocate, JetBrains
Founder, MyGet
AZUG
Focus on web
ASP.NET MVC, Azure, SignalR, ...
Former MVP Azure & ASPInsider
Big passion: Azure
http://blog.maartenballiauw.be
@maartenballiauw
Agenda
Garbage Collector
Helping the GC
Allocations & hidden allocations
Strings
Exploring the heap
Garbage Collector
.NET runtime
Manages execution of programs
Just-in-time compilation: Intermediate Language (IL) -> machine code
Type safety
Exception handling
Security
Thread management
Memory management
Garbage collection (GC)
Memory management and GC
Virtually unlimited memory for our applications
Big chunk of memory pre-allocated
Runtime manages allocation in that chunk
Garbage Collector (GC) reclaims unused memory, making it available again
.NET memory management 101
Memory allocation
“Managed heap” - region of address space for every new process
Objects allocated in the heap
Allocating memory is fast, it’s just advancing a pointer
Some unmanaged memory is also consumed (not GC-ed)
.NET CLR, Dynamic libraries, Graphics buffer, …
Memory release or “Garbage Collection” (GC)
Generations
Large Object Heap
.NET memory management 101
Memory allocation
Memory release or “Garbage Collection” (GC)
GC releases objects no longer in use by examining application roots
GC builds a graph of all the objects that are reachable from these roots
Object unreachable? Remove object, release memory, compact heap
Takes time to scan all objects!
Generations
Large Object Heap
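A minimal sketch of the reachability idea above, assuming a hypothetical `Node` type and a forced `GC.Collect()` purely for demonstration. An object referenced from a static field (an application root) survives a collection, while an object nothing points to any more is typically reclaimed. `ReachabilitySketch` and its members are illustrative names, not part of the talk's demo code.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReachabilitySketch
{
    // Hypothetical type used only for this illustration.
    private sealed class Node
    {
        public readonly object Child = new object();
    }

    // A static field is an application root: whatever it references stays reachable.
    private static readonly Node Rooted = new Node();

    // Kept in its own non-inlined method so that no local variable or register
    // in the caller accidentally keeps the new Node alive.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference<Node> CreateUnrootedNode()
    {
        return new WeakReference<Node>(new Node());
    }

    // True: the node is reachable from a root, so the GC must keep it.
    public static bool StaysAliveWhileRooted()
    {
        var tracker = new WeakReference<Node>(Rooted);
        GC.Collect();
        return tracker.TryGetTarget(out _);
    }

    // Usually true: nothing references the node, so a collection can reclaim it.
    // The GC gives no hard guarantee about when this happens.
    public static bool IsGoneWhenUnrooted()
    {
        WeakReference<Node> tracker = CreateUnrootedNode();
        GC.Collect();
        return !tracker.TryGetTarget(out _);
    }
}
```

Calling `GC.Collect()` like this is only meant for experiments; production code normally leaves scheduling to the runtime.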
.NET memory management 101
Memory allocation
Memory release or “Garbage Collection” (GC)
Generations – be fast and efficient!
Managed heap divided in segments: generation 0, 1 and 2
Gen0 – run GC more often, object in use -> move to gen1
Gen 1 – run GC less often, object in use -> move to gen2
Gen 2 – run GC even less often, object in use stays in gen2
Reduce time needed to perform GC
Large Object Heap
.NET memory management 101
Memory allocation
Memory release or “Garbage Collection” (GC)
Generations
Large Object Heap
Generation 0: short-lived objects (e.g. local variables)
Generation 1: in-between objects
Generation 2: long-lived objects (e.g. app’s main form)
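One way to watch promotion between generations is `GC.GetGeneration`. The sketch below is an assumption-laden experiment, not the talk's demo: it forces collections with `GC.Collect()` and records which generation a surviving object is reported in each time. `GenerationSketch` is a hypothetical name.

```csharp
using System;

public static class GenerationSketch
{
    // Returns the generation reported for one surviving object:
    // right after allocation, after one collection, and after a second one.
    public static int[] ObserveGenerations()
    {
        var survivor = new object();
        var seen = new int[3];

        seen[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();
        seen[1] = GC.GetGeneration(survivor); // survived one collection
        GC.Collect();
        seen[2] = GC.GetGeneration(survivor); // survived two collections

        GC.KeepAlive(survivor); // keep the object reachable until this point
        return seen;
    }
}
```

On a typical desktop runtime the values climb towards `GC.MaxGeneration`, but the exact numbers depend on the runtime and GC configuration.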
.NET memory management 101
Memory allocation
Memory release or “Garbage Collection” (GC)
Generations
Large Object Heap (LOH)
Special segment for large objects (85,000 bytes or larger)
Collected only during full garbage collection
Not compacted (by default)
Fragmentation can cause OutOfMemoryException
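A small sketch of the LOH behaviour listed above, using only `GC.GetGeneration` and `GCSettings.LargeObjectHeapCompactionMode` from the base class library. The array sizes and the `LargeObjectHeapSketch` name are illustrative assumptions; the default 85,000-byte threshold is assumed to be unchanged.

```csharp
using System;
using System.Runtime;

public static class LargeObjectHeapSketch
{
    // Large arrays go straight to the LOH, which is reported as the
    // highest generation; small arrays start in gen0.
    public static int GenerationOfNewArray(int byteCount)
    {
        return GC.GetGeneration(new byte[byteCount]);
    }

    // Opt in to compacting the LOH during the next blocking full collection.
    // The setting applies once and then falls back to the default (no compaction).
    public static void CompactLargeObjectHeapOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }
}
```

Compacting the LOH copies large blocks of memory, so it is worth measuring before reaching for it to fight fragmentation.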
The .NET garbage collector
When does it run? Vague… But usually:
Out of memory condition – when the system fails to allocate or re-allocate memory
After some significant allocation – if X memory is allocated since previous GC
Failure of allocating some native resources – internal to .NET
Profiler – when triggered from profiler API
Forced – when calling methods on System.GC
Application moves to background
GC is not guaranteed to run
http://blogs.msdn.com/b/oldnewthing/archive/2010/08/09/10047586.aspx
http://blogs.msdn.com/b/abhinaba/archive/2008/04/29/when-does-the-net-compact-framework-garbage-collector-run.aspx
The .NET garbage collector
Runs very often for gen0
Short-lived objects, few references, fast to clean
Local variable
Web request/response
Higher generation
Usually more references, slower to clean
GC pauses the running application to do its thing
Usually short, except when not…
Background GC (enabled by default)
Concurrent with application threads
May still introduce short locks/pauses, usually just for one thread
Helping the GC, avoid pauses
Optimize allocations (but don’t do premature optimization – measure!)
Don’t allocate at all
Make use of IDisposable / using statement
Clean up references, giving the GC an easy job
Finalizers
Beware! Moved to finalizer queue -> always gen++
Weak references
Allow the GC to collect these objects, no need for checks
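The sketch below illustrates two of the points above under stated assumptions: a disposable type that calls `GC.SuppressFinalize` so a disposed instance skips the finalizer queue, and a tiny cache that holds its value through a `WeakReference<T>` so the GC stays free to collect it. `TempResource` and `WeakCache<T>` are hypothetical names, not the classes from the talk's demo solution.

```csharp
using System;

// Hypothetical resource holder following the common dispose pattern.
public class TempResource : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        // Cleanup already happened, so the finalizer no longer needs to run.
        // The object can then be reclaimed without the extra trip through the finalizer queue.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (IsDisposed) return;
        // Release unmanaged resources here (and managed ones when disposing is true).
        IsDisposed = true;
    }

    // Safety net for callers that forget to call Dispose.
    ~TempResource()
    {
        Dispose(false);
    }
}

// Hypothetical single-slot cache that does not keep its value alive.
public sealed class WeakCache<T> where T : class
{
    private readonly Func<T> _factory;
    private readonly WeakReference<T> _entry;

    public WeakCache(Func<T> factory)
    {
        _factory = factory;
        _entry = new WeakReference<T>(factory());
    }

    // Returns the cached value if the GC has not collected it, else recreates it.
    public T Get()
    {
        if (_entry.TryGetTarget(out T cached)) return cached;
        T fresh = _factory();
        _entry.SetTarget(fresh);
        return fresh;
    }
}
```

Wrapping a `TempResource` in a `using` statement calls `Dispose` deterministically; `WeakCache<T>.Get()` always hands back a usable instance, recreating it when the previous one has been collected.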
Helping the GC
DEMO
Allocations
When is memory allocated?
Not for value types (int, bool, struct, decimal, enum, float, byte, long, …)
Typically allocated on the stack (as locals) or inline in their containing object, not as separate heap objects
Not managed by garbage collector
For reference types
When you new
When you ""
Hidden allocations!
Boxing!
Put an int in a box
Take an int out of a box
Lambdas/closures
Allocate compiler-generated DisplayClass to capture state
Params arrays
And more!
int i = 42;
// boxing - wraps the value type in an "object box"
// (allocating a boxed copy of the int on the managed heap)
object o = i;
// unboxing - unpacking the "object box" into an int again
// (CPU effort to unwrap)
int j = (int)o;
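Besides boxing, the closure and params cases listed above can be sketched as follows. This is an illustration with hypothetical method names, not the talk's demo code: `Sum` allocates an array at every call site that passes individual arguments, and `CountAbove` allocates a compiler-generated closure object plus a delegate because the lambda captures `threshold`.

```csharp
using System;
using System.Linq;

public static class HiddenAllocationSketch
{
    // params: calling Sum(1, 2, 3) makes the compiler allocate a new int[3].
    public static int Sum(params int[] values)
    {
        int total = 0;
        foreach (int value in values) total += value;
        return total;
    }

    // Closure: the lambda captures 'threshold', so each call allocates a
    // compiler-generated display class instance and a delegate.
    public static int CountAbove(int[] values, int threshold)
    {
        return values.Count(v => v > threshold);
    }

    // Same result with a plain loop: no closure and no delegate.
    public static int CountAboveNoClosure(int[] values, int threshold)
    {
        int count = 0;
        foreach (int value in values)
        {
            if (value > threshold) count++;
        }
        return count;
    }
}
```

Whether the loop version is worth the extra lines depends on how hot the code path is, which is a question for a profiler rather than for intuition.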
How to find them?
Experience
Intermediate Language (IL)
Profiler
“Heap allocations viewer”
ReSharper Heap Allocations Viewer plugin
Roslyn’s Heap Allocation Analyzer
Don’t do premature optimization – measure!
Hidden allocations
DEMO
ReSharper Heap Allocations Viewer plugin
Roslyn’s Heap Allocation Analyzer
Don’t optimize what should not be optimized.
Measure!
We know when allocations are done...
...but perhaps these don’t matter.
Measure!
How frequently are we allocating?
How frequently are we collecting?
What generation do we end up on?
Are our allocations introducing pauses?
www.jetbrains.com/dotmemory (and www.jetbrains.com/dottrace)
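A profiler gives the full picture, but the first two questions above can be approximated in code. The sketch below assumes a runtime that offers `GC.GetAllocatedBytesForCurrentThread` (.NET Core and recent .NET Framework versions); `GcMeasurementSketch` and `Measure` are hypothetical names.

```csharp
using System;

public static class GcMeasurementSketch
{
    // Runs the workload and reports how many bytes the current thread allocated
    // and how many collections of each generation happened in the meantime.
    public static (long AllocatedBytes, int Gen0, int Gen1, int Gen2) Measure(Action workload)
    {
        int gen0 = GC.CollectionCount(0);
        int gen1 = GC.CollectionCount(1);
        int gen2 = GC.CollectionCount(2);
        long before = GC.GetAllocatedBytesForCurrentThread();

        workload();

        long allocated = GC.GetAllocatedBytesForCurrentThread() - before;
        return (allocated,
                GC.CollectionCount(0) - gen0,
                GC.CollectionCount(1) - gen1,
                GC.CollectionCount(2) - gen2);
    }
}
```

The collection counts are process-wide, so other threads can influence them; treat the numbers as a rough indication and confirm with a memory profiler.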
Always Be Measuring
DEMO
Object pools / object re-use
If it makes sense, re-use objects
Fewer allocations, fewer objects for the GC to scan
Less memory traffic that can trigger a full GC
Object pooling - object pool pattern
Create a pool of objects that can be cleaned and re-used
https://www.codeproject.com/articles/20848/c-object-pooling
“Optimize ASP.NET Core” - https://github.com/aspnet/AspLabs/issues/3
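The object pool pattern above can be sketched in a few lines. This is a minimal illustration under assumptions, not the implementation from the linked article: `SimpleObjectPool<T>` is a hypothetical class, it has no size limit, and the caller supplies both the factory and the reset logic.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class SimpleObjectPool<T> where T : class
{
    private readonly ConcurrentBag<T> _items = new ConcurrentBag<T>();
    private readonly Func<T> _factory;
    private readonly Action<T> _reset;

    public SimpleObjectPool(Func<T> factory, Action<T> reset)
    {
        _factory = factory;
        _reset = reset;
    }

    // Hands out a pooled instance when one is available, otherwise creates a new one.
    public T Rent()
    {
        return _items.TryTake(out T item) ? item : _factory();
    }

    // Cleans the instance and makes it available for the next caller.
    public void Return(T item)
    {
        _reset(item);
        _items.Add(item);
    }
}
```

A pool of `StringBuilder` instances could be created with `new SimpleObjectPool<StringBuilder>(() => new StringBuilder(), sb => sb.Clear())`. Pooled objects tend to live long and end up in gen2, so pooling only pays off when it removes a measurable amount of allocation traffic.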
Garbage Collector & Allocations
GC is optimized for high memory traffic in short-lived objects
Use that knowledge! Don’t fear allocations!
Don’t optimize what should not be optimized…
GC is the concept that makes .NET / C# tick – use it!
Know when allocations happen
GC is awesome
Gen2 collections that stop the world, not so much…
Measure!
Strings
Strings are objects
.NET tries to make them look like a value type, but they are a reference type
Read-only collection of char
Length property
A bunch of operator overloading
Allocated on the managed heap
var a = new string('-', 25);
var b = "Hello, World!";
var c = await httpClient.GetStringAsync("http://blog.maartenballiauw.be");
Measuring string allocations
DEMO
String duplicates
Any .NET application has them (System.Globalization duplicates quite a few)
Are they bad?
.NET GC is fast for short-lived objects, so meh.
Don’t waste memory with string duplicates on gen2
(it’s okay to have strings there)
String literals
Are all strings on the heap? Are all strings duplicated?
var a = "Hello, World!";
var b = "Hello, World!";
Console.WriteLine(a == b);
Console.WriteLine(Object.ReferenceEquals(a, b));
Prints true twice. So “Hello, World!” is only in memory once?
Portable Executable (PE)
#UserStrings
DEMO
String literals in #US
Compile-time optimization
Store literals only once in the PE metadata stream #US (ECMA-335 standard, section II.24.2.4)
Reference literals (IL: ldstr)
var a = Console.ReadLine();
var b = Console.ReadLine();
Console.WriteLine(a == b);
Console.WriteLine(Object.ReferenceEquals(a, b));
String interning to the rescue!
String interning
Store (and read) strings from the intern pool
Simply call String.Intern when “allocating” or reading the string
Scans intern pool and returns reference
var url = string.Intern("http://blog.maartenballiauw.be");
var stringList = new List<string>();
for (int i = 0; i < 1000000; i++)
{
    stringList.Add(url);
}
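To see the effect on strings that are created at runtime rather than written as literals, the following sketch builds two equal strings from characters and compares references with and without `string.Intern`. `InterningSketch` and `BuildAtRuntime` are hypothetical names used only for this illustration.

```csharp
using System;

public static class InterningSketch
{
    // Creates a brand-new string instance with the same characters as 'text'
    // (for non-empty input), similar to what reading user input would do.
    public static string BuildAtRuntime(string text)
    {
        return new string(text.ToCharArray());
    }

    // Two runtime-created strings are equal by value but are separate objects.
    public static bool SameInstanceWithoutInterning(string text)
    {
        string a = BuildAtRuntime(text);
        string b = BuildAtRuntime(text);
        return a == b && ReferenceEquals(a, b);
    }

    // After interning, both variables point at the single pooled instance.
    public static bool SameInstanceWithInterning(string text)
    {
        string a = string.Intern(BuildAtRuntime(text));
        string b = string.Intern(BuildAtRuntime(text));
        return ReferenceEquals(a, b);
    }
}
```

For a non-empty input such as "http://blog.maartenballiauw.be", the first method returns false and the second returns true. The temporary strings passed to `string.Intern` are still allocated; interning only lets them become garbage quickly.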
String interning caveats
Why are not all strings interned by default?
CPU vs. memory
Not on the heap but on intern pool
No GC on intern pool – all strings in memory for AppDomain lifetime!
Rule of thumb
Lots of long-lived, few unique -> interning good
Lots of long-lived, many unique -> no benefit, memory growth
Lots of short-lived -> trust the GC
Measure!
Exploring the heap
for fun and profit
How would you do it...
Build a managed type system, store in memory, CPU/memory friendly
Probably:
Store type info (what’s in there, what’s the offset of fieldN, …)
Store field data (just data)
Store method pointers (“who you gonna call?”)
Inheritance information
Stuff on the Stack
Stuff on the Managed Heap
(scroll down for more...)
IT is just mapping mappings.
Pointer to an “instance”
Instance
Pointer to Runtime Type Information (RTTI)
Field values (which can be pointers in turn)
RTTI
Interface addresses
Instance method addresses
Static method addresses
…
Theory is nice...
Microsoft.Diagnostics.Runtime (ClrMD)
“ClrMD is a set of advanced APIs for programmatically inspecting a crash dump of
a .NET program much in the same way that the SOS Debugging Extensions (SOS)
do. This allows you to write automated crash analysis for your applications as well
as automate many common debugger tasks. In addition to reading crash dumps
ClrMD also allows supports attaching to live processes.”
Maarten’s definition: “LINQ-to-heap”
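As a rough idea of what “LINQ-to-heap” can look like, the sketch below attaches to a running process and counts how often each string value occurs on its managed heap. It assumes the 2.x API of the Microsoft.Diagnostics.Runtime NuGet package (`DataTarget.AttachToProcess`, `ClrInfo.CreateRuntime`, `ClrHeap.EnumerateObjects`, `ClrObject.AsString`); older 1.x releases expose a different API. `StringDuplicateSketch` and its methods are hypothetical names, not the talk's demo code.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Diagnostics.Runtime; // NuGet: Microsoft.Diagnostics.Runtime (2.x API assumed)

public static class StringDuplicateSketch
{
    // Pure helper: how many times does each distinct value occur?
    public static Dictionary<string, int> Tally(IEnumerable<string> values)
    {
        return values.GroupBy(v => v).ToDictionary(g => g.Key, g => g.Count());
    }

    // Walks the managed heap of the target process and yields every string found.
    // The target is suspended while attached (second argument).
    public static IEnumerable<string> ReadHeapStrings(int processId)
    {
        using (DataTarget dataTarget = DataTarget.AttachToProcess(processId, true))
        using (ClrRuntime runtime = dataTarget.ClrVersions[0].CreateRuntime())
        {
            foreach (ClrObject obj in runtime.Heap.EnumerateObjects())
            {
                if (obj.Type == null || !obj.Type.IsString) continue;

                string value = obj.AsString(); // reads at most the default maximum length
                if (value != null) yield return value;
            }
        }
    }
}
```

Duplicates could then be listed with `StringDuplicateSketch.Tally(StringDuplicateSketch.ReadHeapStrings(pid)).Where(pair => pair.Value > 1)`, where `pid` is the id of a process you are allowed to attach to.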
ClrMD introduction
DEMO
String duplicates
DEMO
“Path to root” (why is my object held in memory)
DEMO
Conclusion
Conclusion
Garbage Collector (GC) optimized for high memory traffic + short-lived objects
Don’t fear allocations! But beware of gen2 “stop the world”
String interning when lots of long-lived, few unique
Don’t optimize what should not be optimized…
Measure!
Using a profiler/memory analysis tool
ClrMD to automate inspections
Thank you!
Need training, coaching, mentoring, performance analysis?
Hire me! https://blog.maartenballiauw.be/hire-me.html
http://blog.maartenballiauw.be
@maartenballiauw

Exploring .NET memory management - A trip down memory lane - Copenhagen .NET User Group


Editor's Notes

  1. https://pixabay.com/en/memory-computer-component-pcb-1761599/
  2. https://pixabay.com/en/tires-used-tires-pfu-garbage-1846674/
  3. Application roots: Typically, these are global and static object pointers, local variables, and CPU registers.
  4. Application roots: Typically, these are global and static object pointers, local variables, and CPU registers. The GC runs very often on gen0, as short-lived objects usually have few other objects pointing to them, which makes cleanup quite fast - think objects used within the scope of a method, or a web request that allocates some objects that are obsolete once the response is rendered. The longer an object remains in memory, the more difficult it tends to become to clean up, so the garbage collector runs less often on gen1, and even less often on gen2. Objects in these generations may live longer, so it makes no sense to check them all every time the GC runs. Running the GC means consuming CPU and freezing your application. These pauses are usually very short, but I’ve seen GC cycles of several seconds on big server applications - blocking incoming requests.
  5. Application roots: Typically, these are global and static object pointers, local variables, and CPU registers.
  6. Application roots: Typically, these are global and static object pointers, local variables, and CPU registers.
  7. Open TripDownMemoryLane.sln
     Show WeakReferenceDemo (demo “1-1”)
     Explain weak reference allows GC to collect reference
     Show Cache object – has weak references to data, we expect these to probably be cleaned up by GC
     Attach profiler, run demo “1-1”, snapshot, see 20 instances of WeakReference<Data>
     Snapshot again, compare – see WeakReference<Data> has been regenerated a couple of times
     Show DisposeObjectsDemo (demo “1-2”)
     Explain first demo does not dispose and relies on GC + finalizers. This will mean our object remains in memory for two GC cycles!
     Explain dispose does clean them up and requires only one cycle
     In SampleDisposable, explain GC.SuppressFinalize -> tell the GC no finalizer queue work is needed here!
  8. Open TripDownMemoryLane.sln Show Demo02_Random Open IL viewer tool window, show what happens in IL for each code sample Explain IL viewer + hovering statements to see what they do BoxingRing() – show boxing and unboxing statements in IL, explain they consume CPU and allocate an object ParamsArray() – the call to ParamsArrayImpl() actually allocates a new string array! CPU + memory AverageWithinBounds() – temporary class is created to capture state of all variables, then passed around IL_0000: newobj instance void TripDownMemoryLane.Demo02.Demo02_Random/'<>c__DisplayClass3_0'::.ctor() Lambdas() – same thing, temporary class to capture state in the loop IL_001f: newobj instance void Allocatey.Talk.Demo02_Random/'<>c__DisplayClass4_0'::.ctor() Show Demo02_ValidateArgumentsDemo – this one is fun! Explain what we want to do: build a guard function – check a condition, show error First one is the easy one, but it allocates a string and runs string.Format Second one is better – does not allocate the string! But does allocate a function and a state capture... Third one – allocates an array (params) Fourth one – no allocations, yay! Using overloads... Show heap allocations viewer!
  9. Open TripDownMemoryLane.sln
     Show BeersDemoUnoptimized (demo “3-1” and “3-2”)
     Explain we’re building an application that shows all beers in the world and their ratings
     Stored in beers.json (show document) with beer name, brewery, number of votes
     For a view in our application, read this file into a multi-dimensional dictionary that contains breweries, beers, and their rating
     Show BeerLoader and note the dictionary format
     Show LoadBeersInsane and explain this is BAD BAD BAD because of the high memory usage
     Show LoadBeersUnoptimized, explain what it does, optimized against the insane version as we’re streaming over our file
     Load beers a number of times
     Inspect snapshots
     GC is very visible
     Most memory in gen2 (we keep our beers around)
     Compare two snapshots: high traffic on dictionary items (lots of string allocations - JSON.NET)
     Show LoadBeersOptimized, explain what it does, re-using dictionary and updating items as we read the JSON
     Load beers a number of times
     Inspect snapshots
     GC is almost invisible
     Fewer allocations happening
     Compare two snapshots: almost no traffic
     Less work for GC, fewer pauses!
     Measure and make it look good!
  10. There is an old adage in IT that says “don’t do premature optimization”. In other words: maybe some allocations are okay to have, as the GC will take care of cleaning them up anyway. While some do not agree with this, I believe in the middle ground. The garbage collector is optimized for high memory traffic in short-lived objects, and I think it’s okay to make use of what the .NET runtime has to offer us here. If it’s in a critical path of a production application, fewer allocations are better, but we can’t write software with zero allocations - it’s what our high-level programming language uses to make our developer life easier. It’s not okay to have objects go to gen2 and stay there when in fact they should be gone from memory. Learn where allocations happen, using any of the above methods, and profile your production applications frequently to see if there are large objects in higher generations of the heap that don’t belong there.
  11. Open TripDownMemoryLane.sln Show StringAllocationsDemo (demo “4”) Show AllocateSomeStrings, mention a few strings will be allocated (a, b and c) AllocateSomeStringDuplicates – same thing, but a lot of strings! In the loop, every string will be added to memory, crazy! Run with dotMemory attached, capture snapshot See string duplicates! Just for fun, attach to devenv.exe
  12. Will print “true” twice.
  13. Open our demo application in dotPeek Explain PE headers Show #US table Open StringAllocationDemo class. Jump to IL code, show ldstr statement for strings that are in #US table
  14. Code = trick question, what if we enter same value twice? String equals, reference not equals!
  15. How many strings are stored
  16. How many strings are stored
  17. Open ClrMD.sln Explain: two projects, one target application, one running ClrMD to analyze what we have Open ClrMD.Explorer.Program, show attaching ClrMD Get CLR version – gets info about the current CLR version Get runtime – gets info about the actual runtime hosting our app Show DumpClrInfo – get info, stress DAC data access components location – defines the runtime structures, used by ClrMD and VS Debugger etc to explore runtime while debugging/profiling/... Explore DumpHeapObjects, stress the heap structure Loop object addresses - foreach (var objectAddress in generation) Get type of object at address - var type = heap.GetObjectType(objectAddress.Ptr); Use type info to get value - type.GetValue(objectAddress.Ptr) Explore type autocomplete – structure to get enum, method addresses, ...
  18. Open ClrMD.sln Show DumpStringDuplicates Count total strings For each string, store value + count Dump to console
  19. Open ClrMD.sln Run ClrMd.Target with dotMemory attached Show Clock object retention path Explain what this means (object held in memory because...) Show ClrMd.Target code, explain in code Can we build this type of analysis ourselves? Yes we can! Show DumpRetention Enumerate all objects, find our Clock object (get type of object at address, compare) When we have the address of our object, enumerate all object roots (all trees of objects that are in use) Walk all of these trees and find our object address If found, we’re done! Run it, show output, show DGML output as well