Scratch memory in a managed environment

Question

I'm working on a fluid simulation in C#. Each cycle I need to calculate the velocity of the fluid at discrete points in space. As part of that calculation, I need a few tens of kilobytes for scratch space to hold some double[] arrays (the exact size of the arrays depends on some input data). The arrays are only needed for the duration of the method that uses them, and there are a few different methods that needs scratch space like this.

As I see it, there are a few different solutions for constructing the scratch arrays:

Use 'new' to grab memory from the heap every time the method is called. This is what I was doing at first, however it puts a lot of pressure on the garbage collector, and the few-ms spikes once or twice a second are really annoying.
Have the scratch arrays passed as parameters when calling the methods. Problem is that this forces the user to manage them, including sizing them appropriately, which is a huge pain. And it makes using more or less scratch memory difficult since it changes the API.
Use stackalloc in an unsafe context to allocate the scratch memory from the program stack. This would work just fine, except I'd need to compile with /unsafe and constantly sprinkle unsafe blocks throughout my code, which I'd like to avoid.
Preallocate private arrays once when the program starts up. This is fine except I don't necessarily know the size of the arrays I need until I can look at some of the input data. And it gets really messy since you can't limit the scope of these private variables to just a single method, so they're constantly polluting the namespace. And it scales poorly as the number of methods that need scratch memory increases, since I'm allocating a lot of memory that's only used a fraction of the time.
Create some kind of central pool, and allocate scratch memory arrays from the pool. The main problem with this is that I don't see an easy way to allocate dynamically sized arrays from a central pool. I could use a starting offset and length and have all the scratch memory essentially share a single large array, but I have a lot of existing code that assumes double[]s. And I'd have to be careful to make such a pool thread safe.

...

Does anyone have any experience with a similar problem? Any advice/lessons to offer from the experience?

Did you really mean a few tens of kilobytes? Because that's such a tiny amount, I'd not worry about memory management for that... — Kal_Torak, Commented Mar 31, 2013 at 4:08
It doesn't sound like a lot, but if I'm running 2000 cycles/sec, suddenly it's something like 60MB/sec, and the GC starts to notice. — Jay Lemmon, Commented Mar 31, 2013 at 4:15
Hate to be captain hindsight, but if your application is a hard real time application, C# wasn't a good language choice to begin with stackoverflow.com/questions/3762081/… — Filip, Commented Mar 31, 2013 at 10:39
Managed languages have some peculiarities you have to handle, but really you're overstating the case. Barring SIMD, you can get the exact same performance from C# as from C if you're careful. Anyway, I'm not interested in a language debate; it's not germane to my original question. — Jay Lemmon, Commented Mar 31, 2013 at 11:48
A possible implementation of #3 would be to use a custom attribute and then to use a rewriting tool to rewrite your code to use stackalloc. Until compilation your own code will act as though you were just using local arrays. This shifts most of the ugliness of stackalloc out of your code and into an extra compilation step. Of course, the existing answers avoid this ugliness entirely. I much prefer Eric and Tom's approaches. — Brian, Commented Mar 31, 2013 at 18:21

Michael Stum · Accepted Answer · 2013-03-31 21:03:59Z

I sympathize with your situation; when I was working on Roslyn we considered very carefully the potential performance problems caused by collection pressure from allocating temporary working arrays. The solution we settled on was a pooling strategy.

In a compiler the array sizes tend to be small and therefore repeated frequently. In your situation, if you have large arrays then what I would do is follow Tom's suggestion: simplify the management problem and waste some space. When you ask the pool for an array of size x, round x up to the nearest power of two and allocate an array of that size, or take one from the pool. The caller gets an array that is a little bit too big, but they can be written to deal with that. It shouldn't be too hard to search the pool for an array of the appropriate size. Or you could maintain a bunch of pools, one pool for arrays of size 1024, one for 2048, and so on.

Writing a threadsafe pool is not too hard, or you can make the pool thread static and have one pool per thread.

The tricky bit is getting memory back in the pool. There are a couple ways to deal with that. First, you could simply require the user of the pooled memory to call a "back in the pool" method when they're done with the array, if they don't want to take on the expense of collection pressure.

Another way is to write a facade wrapper around the array, make it implement IDisposable so that you can use "using" (*), and make a finalizer on that which puts the object back in the pool, resurrecting it. (Make sure to have the finalizer turn back on the "I need to be finalized" bit.) Finalizers that do resurrection make me nervous; I would personally prefer the former approach, which is what we did in Roslyn.

(*) Yes, this violates the principle that "using" should indicate that an unmanaged resource is being returned to the operating system. Essentially we're treating managed memory as an unmanaged resource by doing our own management, so it's not so bad.

Thanks, this is exactly the sort of war story I was hoping for :) A pool is sounding like the most reasonable solution so far, and I don't mind "wasting" the memory like this. I'm still not super sold on relying on users to release back the resource, or trying to cram it in to a Dispose method, but I guess it is what it is. — Jay Lemmon, Commented Mar 31, 2013 at 20:03

Community · Accepted Answer · 2017-05-23 11:51:32Z

~~You can wrap the code that uses theses scratch arrays in a using statement like this:~~

~~using(double[] scratchArray = new double[buffer]) { // Code here... }~~

~~This will free the memory explicitly by calling the descructor at the end of the using statement.~~

Unfortunately it appears the above is not true! In lieu of that, you can try something along the lines of having a helper function that returns an array of the appropriate size (closest power of 2 greater than the size), and if it doesn't exist, creating it. This way, you'd only have a logarithmic number of arrays. If you wanted it to be thread-safe though you would need to go to a bit more trouble.

It could look something like this: (using pow2roundup from Algorithm for finding the smallest power of two that's greater or equal to a given value)

private static Dictionary<int,double[]> scratchArrays = new Dictionary<int,double[]>();
/// Round up to next higher power of 2 (return x if it's already a power of 2).
public static int Pow2RoundUp (int x)
{
    if (x < 0)
        return 0;
    --x;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x+1;
}
private static double[] GetScratchArray(int size)
{
    int pow2 = Pow2RoundUp(size);
    if (!scratchArrays.ContainsKey(pow2))
    {
        scratchArrays.Add(pow2, new double[pow2]);
    }
    return scratchArrays[pow2];
}

Edit: Thread-safe version: This will still have things that get garbage collected, but it's going to be thread-specific and should be much less overhead.

[ThreadStatic]
private static Dictionary<int,double[]> _scratchArrays;

private static Dictionary<int,double[]> scratchArrays
{
    get
    {
        if (_scratchArrays == null)
        {
            _scratchArrays = new Dictionary<int,double[]>();
        }
        return _scratchArrays;
    }
}

/// Round up to next higher power of 2 (return x if it's already a power of 2).
public static int Pow2RoundUp (int x)
{
    if (x < 0)
        return 0;
    --x;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x+1;
}
private static double[] GetScratchArray(int size)
{
    int pow2 = Pow2RoundUp(size);
    if (!scratchArrays.ContainsKey(pow2))
    {
        scratchArrays.Add(pow2, new double[pow2]);
    }
    return scratchArrays[pow2];
}

Unfortunately, Disposed != Collected :( See stackoverflow.com/questions/655902/using-and-garbage-collection — Jay Lemmon, Commented Mar 31, 2013 at 4:26
That is unfortunate. I'll update my answer with another idea. — Tom Jacques, Commented Mar 31, 2013 at 4:28
Ah, I didn't know about ThreadStatic. That does make thread-local-ish scratch memory much easier. — Jay Lemmon, Commented Mar 31, 2013 at 11:15

Collectives™ on Stack Overflow

Scratch memory in a managed environment

2 Answers 2

Not the answer you're looking for? Browse other questions tagged
c#
memory-management
managed
or ask your own question.

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Not the answer you're looking for? Browse other questions tagged c#memory-managementmanaged or ask your own question.

Linked

Related

Not the answer you're looking for? Browse other questions tagged
c#
memory-management
managed
or ask your own question.