$\begingroup$

The Dolphin emulator for GameCube/Wii can use the ARB_buffer_storage extension (or EXT_buffer_storage on GLES) to improve rendering performance.

According to the extension's description, a GPU driver implementation provides a means to allocate storage that can be marked as immutable, i.e. it cannot be deallocated or resized while in use. What is required of the driver implementation for this to actually be useful? At one extreme, a naive implementation could allocate immutable buffers exactly the same way it allocates ordinary global memory buffers, which would be functionally correct but would bring no performance benefit.

At least one user has suggested that buffer_storage only matters for discrete GPUs, which carry their own dedicated RAM.

On the other hand, the Dolphin team repeatedly requested buffer_storage support in Mali's driver over the years, which suggests there is a performance improvement even on SoCs with integrated graphics and no dedicated GPU DRAM. If so, is the benefit at all comparable, and how does the driver achieve it?

Does Vulkan have an equivalent, or does it somehow avoid the need for such an extension?

The intent here is to understand the value in implementing EXT_buffer_storage in the driver for Broadcom's VideoCore VI.

[1] "Dolphin on the pi 4"

[2] "Will there be EXT_buffer_storage at all?"

[3] "EXT_buffer_storage is now supported by ARM"

[4] "Possible to add GLES EXT_buffer_storage driver support to Videocore VI on the Pi4?"

$\endgroup$
  • $\begingroup$ "that can be marked as immutable (constant)" No, "constant" and "immutable" are different things. "Immutable" means that you cannot reallocate the memory; "constant" means you cannot change the values stored in the memory. Not the same thing at all. $\endgroup$ Commented Sep 15, 2019 at 13:59

1 Answer

$\begingroup$

The marquee feature of buffer storage is not the immutability of the allocation itself, but a feature you couldn't have without immutable allocation: persistently mapped buffers.

Pre-buffer_storage, you could not use a buffer for rendering while it was mapped. This restriction exists to give implementations the freedom to play games with mapping behind your back. For example, it's legal for glMapBufferRange to not actually return a pointer into the buffer's storage; it could just return some CPU-accessible memory which, at unmap time, gets copied into the actual buffer. To permit such freedoms, mapping and unmapping a buffer is something you have to do semi-frequently.
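To make the "playing games" point concrete, here is a toy model in plain C. It makes no OpenGL calls, and every name in it (ToyBuffer, toy_map_range, toy_unmap and so on) is hypothetical. It is only a sketch of one strategy a driver is allowed to use, not a description of what any particular driver does: the pointer handed out by "map" is scratch memory, and the data only reaches the real storage at "unmap".

```c
#include <stdlib.h>
#include <string.h>

/* Toy model of a pre-buffer_storage buffer object. All names are
   hypothetical; real driver internals are not visible to the application. */
typedef struct {
    unsigned char *storage;   /* stands in for the memory the GPU reads */
    size_t size;
    unsigned char *staging;   /* CPU-side scratch copy, non-NULL while mapped */
    size_t map_offset;
    size_t map_length;
} ToyBuffer;

/* Allocate zero-filled storage. Returns 1 on success, 0 on failure. */
int toy_init(ToyBuffer *b, size_t size) {
    b->storage = calloc(size, 1);
    b->size = size;
    b->staging = NULL;
    b->map_offset = 0;
    b->map_length = 0;
    return b->storage != NULL;
}

/* "Map" a range: returns scratch memory, not a pointer into storage.
   Fails (NULL) if already mapped, if length is zero, or if out of range. */
void *toy_map_range(ToyBuffer *b, size_t offset, size_t length) {
    if (b->staging != NULL || length == 0) return NULL;
    if (offset > b->size || length > b->size - offset) return NULL;
    b->staging = malloc(length);
    if (b->staging == NULL) return NULL;
    memcpy(b->staging, b->storage + offset, length);
    b->map_offset = offset;
    b->map_length = length;
    return b->staging;
}

/* "Unmap": only now do the writes reach the storage the GPU would see. */
void toy_unmap(ToyBuffer *b) {
    if (b->staging == NULL) return;
    memcpy(b->storage + b->map_offset, b->staging, b->map_length);
    free(b->staging);
    b->staging = NULL;
}

/* Release everything, discarding any pending mapped writes. */
void toy_free(ToyBuffer *b) {
    free(b->staging);
    free(b->storage);
    b->staging = NULL;
    b->storage = NULL;
}
```

In this model a write through the mapped pointer is invisible to the "GPU side" until toy_unmap runs, which is exactly why a buffer handled this way cannot be left mapped while it is in use.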

Furthermore, pre-buffer_storage, it is 100% legal to call glBufferData (note the lack of Sub) on that buffer again. This will cause the old data storage to be discarded and new storage allocated. So you couldn't keep a buffer mapped if you tried to reallocate its storage.

Also, implementations couldn't really know where to allocate storage on GPUs with multiple kinds of memory, so they would tend to move the buffer's allocation around based on how you use it. If you frequently upload to it, they'll eventually put it into CPU-accessible memory to speed up such operations. And so forth. But this is all done after allocating storage, and it's done based on how you use the buffer. So having the buffer be mapped would prevent such shuffling around of data.

Making a buffer's storage immutable solves these problems. You cannot reallocate an immutable buffer's storage, so there's no problem with keeping it mapped forever. glBufferStorage takes usage flags which, unlike glBufferData's hints, are binding: any kind of access you did not request through a flag is explicitly forbidden. So the implementation knows exactly where to put the storage and will not need to move it around after the fact. So there's no problem with persistently mapped storage.
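As a rough illustration of what "binding flags" means, the following plain C sketch models the validity checks as I understand them from the ARB_buffer_storage rules. It makes no OpenGL calls. The TOY_ constants carry the same values as the corresponding GL_MAP_*_BIT tokens in the OpenGL headers, but the two functions are hypothetical helpers written for this answer, not part of any API.

```c
#include <stdbool.h>

/* Same numeric values as GL_MAP_READ_BIT, GL_MAP_WRITE_BIT,
   GL_MAP_PERSISTENT_BIT and GL_MAP_COHERENT_BIT; the TOY_ prefix marks
   this as a model, not real GL code. */
enum {
    TOY_MAP_READ_BIT       = 0x0001,
    TOY_MAP_WRITE_BIT      = 0x0002,
    TOY_MAP_PERSISTENT_BIT = 0x0040,
    TOY_MAP_COHERENT_BIT   = 0x0080
};

/* Models the flag rules applied when immutable storage is created:
   persistent mapping needs read or write access, and coherent mapping
   needs persistent mapping. */
bool toy_storage_flags_valid(unsigned flags) {
    if ((flags & TOY_MAP_PERSISTENT_BIT) &&
        !(flags & (TOY_MAP_READ_BIT | TOY_MAP_WRITE_BIT)))
        return false;
    if ((flags & TOY_MAP_COHERENT_BIT) && !(flags & TOY_MAP_PERSISTENT_BIT))
        return false;
    return true;
}

/* Models the check applied when mapping immutable storage: the mapping
   must ask for read or write access, and may only ask for kinds of access
   that were requested when the storage was created. */
bool toy_map_allowed(unsigned storage_flags, unsigned access) {
    const unsigned checked = TOY_MAP_READ_BIT | TOY_MAP_WRITE_BIT |
                             TOY_MAP_PERSISTENT_BIT | TOY_MAP_COHERENT_BIT;
    if (!(access & (TOY_MAP_READ_BIT | TOY_MAP_WRITE_BIT)))
        return false;
    return ((access & checked) & ~storage_flags) == 0;
}
```

For example, storage created with only TOY_MAP_WRITE_BIT cannot later be mapped persistently or for reading. Because the application commits to its access pattern up front, the implementation can pick a memory location once and leave it there.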

And I imagine that, for a console emulator, having persistently mapped storage would be very useful. After all, programs written for such consoles touch memory directly at any time, the same memory that the GPU can address. So storing it in a buffer and keeping that memory mapped all the time probably works out better than uploading pieces of it, or trying to figure out which pieces should be uploaded, or whatever.

$\endgroup$
