$\begingroup$

Source

On this webpage from NVIDIA, the author(s) seem to imply that you could build a command buffer for each eye on a separate thread. However, I don't see the benefit of this over instanced stereo rendering, since with multiple command buffers you must also handle synchronization.

Possible Synchronization Issues

For synchronization (feel free to correct me if I'm wrong), I was thinking of the case where you use a framebuffer for each eye, the "naive" way of rendering for VR. I suppose you could instead render to a single framebuffer, as with stereo instanced rendering. Regardless, the synchronization issue I'm concerned about is one eye finishing considerably faster than the other. With stereo instanced rendering this seems less likely to happen, since the workload is more divisible. The other place I see synchronization causing a performance hit is that you must wait for both threads to finish submitting their respective command buffers before submitting them again. Arguably this might not be a huge concern, but it seems unnecessary nonetheless.

Spatial Locality Issues

It also seems that, since there is no way to guarantee when a given object goes through the pipeline for each eye, you may get more cache misses on the GPU than you would with instancing.

Question

Is there any advantage to using multiple command buffers for stereo rendering compared to instanced stereo rendering?

$\endgroup$
  • $\begingroup$ "since you must also handle synchronization with multiple command buffers" What synchronization, in particular, are you referring to? $\endgroup$ Commented Nov 10, 2016 at 20:21
  • $\begingroup$ I edited the question to be more specific. Basically, there are two types of synchronization that I'm concerned about, one with the framebuffers, and one with command buffer submission/frequency. $\endgroup$
    – aces
    Commented Nov 11, 2016 at 2:50

1 Answer

$\begingroup$

Yes, there are advantages. You can submit different command buffers to different queues.

Modern NVIDIA hardware, in particular, offers a full sixteen separate queues that are capable of rendering independent graphics operations. By putting each eye in a separate queue, you may be able to use the available hardware resources more efficiently.

Then again, it may not. It all really depends on how things work out down in the lowest levels of hardware.


"Regardless, the issue with synchronization that I'm concerned about would be if one eye finishes considerably faster than the other eye. It seems that with stereo instanced rendering, this would be less likely to happen since the workload is more divisible."

It would basically never happen with instanced rendering. But you say that like it's a good thing.

If one eye happens to have more scene complexity than the other, then instanced stereo rendering will waste a lot of vertex processing on vertices for the "weak" eye that will never be seen. Whereas if you built the command buffers separately, you would only submit the stuff that actually matters for each eye.

If you submit both CBs to the same queue, the total time spent processing those CBs can be less than the time spent processing the single instanced CB, since the GPU isn't processing a bunch of triangles that will just be culled. If you submit the two CBs to different queues, how that balances out depends entirely on how well the hardware can adapt to the workload it is given. If it can adapt well, then you'll be fine.

So for this particular case, the multiple CB setup may represent a more efficient use of hardware resources. Granted, this particular case is rather rare.
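To make the vertex-processing argument concrete, here is a minimal cost-model sketch. It is not Vulkan code; `SceneObject`, `instancedVertexWork`, and `perEyeVertexWork` are hypothetical names invented for illustration, and the model assumes per-object CPU culling and counts only vertex invocations (it ignores the extra CPU cost of recording two command buffers).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical scene object: a vertex count plus the result of per-eye culling.
struct SceneObject {
    std::uint64_t vertexCount;
    bool visibleLeft;
    bool visibleRight;
};

// Instanced stereo (assumed model): one draw with two instances, so an object
// visible to either eye has its vertices processed once per eye.
std::uint64_t instancedVertexWork(const std::vector<SceneObject>& scene) {
    std::uint64_t total = 0;
    for (const SceneObject& obj : scene) {
        if (obj.visibleLeft || obj.visibleRight) {
            total += 2 * obj.vertexCount;
        }
    }
    return total;
}

// Separate command buffers (assumed model): each eye records only the objects
// that survived that eye's own culling pass.
std::uint64_t perEyeVertexWork(const std::vector<SceneObject>& scene) {
    std::uint64_t total = 0;
    for (const SceneObject& obj : scene) {
        if (obj.visibleLeft) {
            total += obj.vertexCount;
        }
        if (obj.visibleRight) {
            total += obj.vertexCount;
        }
    }
    return total;
}
```

When every visible object is seen by both eyes, the two functions return the same number, which matches the point that the lopsided case is rare.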

Synchronization for this case is really not a problem. If one eye is so complicated relative to the other that you drop a frame, then it doesn't matter which form you're using to render. You're going to drop a frame.

With Vulkan, rendering entirely separate stuff to multiple queues isn't difficult. Indeed, the only real synchronization you might need would be to synchronize the presentation of the swapchain images, which is easily done with semaphores.
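As a rough illustration of why the presentation wait is the only sync point that matters here, the toy timing model below bounds when a frame becomes presentable. It is not Vulkan code, and all function names are hypothetical. In real code the present would typically wait on one semaphore per eye (the `pWaitSemaphores` array of `VkPresentInfoKHR`); the model just assumes the best case is perfect overlap across two queues and the worst case is full serialization.

```cpp
#include <algorithm>
#include <cassert>

// Best case (assumed): the two queues overlap perfectly, so the frame is
// presentable once the slower eye has signaled its semaphore.
double readyTimeOverlappedMs(double leftEyeMs, double rightEyeMs) {
    return std::max(leftEyeMs, rightEyeMs);
}

// Worst case (assumed): the hardware runs the two command buffers back to
// back, as it would on a single queue.
double readyTimeSerializedMs(double leftEyeMs, double rightEyeMs) {
    return leftEyeMs + rightEyeMs;
}

// A frame is dropped when it is not presentable within the refresh budget.
bool dropsFrame(double readyMs, double budgetMs) {
    return readyMs > budgetMs;
}
```

Where a real GPU lands between these two bounds depends on the hardware, which is why a heavy eye that blows the budget drops the frame regardless of how the work was submitted.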

"The other issue I see where synchronization could cause a performance hit is that you must wait for both threads to complete sending their respective command buffer before sending it again."

No, you don't. If you're reusing CBs, you can always use VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT, which allows you to submit a CB again while an earlier submission of it is still pending. And if you're not reusing CBs (that is, your CBs aren't static), you don't care; just double-buffer your command pools so that you aren't trying to reset a CB that's being consumed.

The latter is the general standard method for rendering dynamic scenes with Vulkan.
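The double-buffering idea can be sketched without any Vulkan headers. In the model below, `FrameSlot` and `FrameRing` are hypothetical stand-ins: each slot represents one command pool plus the fence guarding it, and the comments name the real calls (`vkWaitForFences`, `vkResetCommandPool`, `vkQueueSubmit`) that would sit at each step. It assumes two frames in flight and a GPU that has always finished by the time the CPU waits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one frame's command pool and its fence.
struct FrameSlot {
    bool gpuBusy = false;      // models a fence that has not signaled yet
    std::uint32_t resets = 0;  // how many times this pool has been reset
};

class FrameRing {
public:
    // Pick this frame's slot, wait for its previous use, then reset its pool.
    std::size_t beginFrame(std::uint64_t frameIndex) {
        const std::size_t slot =
            static_cast<std::size_t>(frameIndex % slots_.size());
        waitForGpu(slot);               // real code: vkWaitForFences
        assert(!slots_[slot].gpuBusy);  // never reset a pool the GPU is reading
        ++slots_[slot].resets;          // real code: vkResetCommandPool
        return slot;
    }

    // Mark the slot as in flight (real code: vkQueueSubmit with the slot's fence).
    void submit(std::size_t slot) { slots_[slot].gpuBusy = true; }

    const FrameSlot& slot(std::size_t i) const { return slots_[i]; }

private:
    // Model assumption: the GPU has finished by the time we wait on it.
    void waitForGpu(std::size_t slot) { slots_[slot].gpuBusy = false; }

    std::array<FrameSlot, 2> slots_{};
};
```

A frame loop would call `beginFrame(n)`, record both eyes' command buffers from that slot's pool, then call `submit` on the returned slot. The CPU only ever waits on the frame from two iterations ago, not on the one just submitted.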

$\endgroup$
