$\begingroup$

VR rendering needs a lot of GPU power, and VR SLI can help with that. But is it possible to use distributed parallel rendering to improve performance dramatically? There is an open-source framework, Equalizer, that can do this.

For example, put 4 or 8 GPUs in one PC server, disable the vendor's SLI driver, and use Equalizer to do the parallel rendering. Or run Equalizer across 16 GPU-equipped PC servers.

Would that give us very high performance easily?

Equalizer supports both sort-first and sort-last decomposition. Since sort-last needs a lot of network bandwidth, sort-first seems the better choice.
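To make the sort-first idea concrete, here is a minimal sketch (hypothetical code, not Equalizer's actual API) of the decomposition Equalizer calls "2D": the final image is split into screen-space tiles, each node renders only its tile, and compositing is simple tile assembly, with no per-pixel depth merge as in sort-last.

```python
# Hypothetical sketch of sort-first (2D) screen-space decomposition.
# Each node renders only its assigned tile of the final viewport, so
# the master only has to place tiles side by side (no depth compositing).

def split_viewport(width, height, num_nodes):
    """Split the viewport into num_nodes horizontal bands (tiles).

    Returns a list of (x, y, w, h) tiles, one per render node.
    """
    band = height // num_nodes
    tiles = []
    for i in range(num_nodes):
        y0 = i * band
        # The last band absorbs any remainder rows.
        y1 = height if i == num_nodes - 1 else y0 + band
        tiles.append((0, y0, width, y1 - y0))
    return tiles

# Example: a 2160x1200 HMD frame split across 4 render nodes.
tiles = split_viewport(2160, 1200, 4)
```

In a real system the tile sizes would be load-balanced dynamically (as Equalizer's 2D compound does), since scene complexity is rarely uniform across the screen.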

$\endgroup$
  •
    $\begingroup$ Could you explain the difference in what you are looking for here, compared to your previous question? Is this a different approach? $\endgroup$ Commented Aug 1, 2016 at 13:27
  • $\begingroup$ @trichoplax, thanks for your comment. It is the same goal: I'm trying to implement VR in cloud rendering, but I have to decide on an approach. Rendering on a single server would be easier than distributed parallel rendering, but even then there are two ways to achieve it: rely on the SLI driver, or use Equalizer. Doing everything in one server isn't flexible, in terms of resource sharing and hardware sourcing. If we can achieve distributed parallel rendering, maybe we can choose PowerVR or Mali GPUs, not NVIDIA/AMD. I know about the concerns over network latency, and we have solutions to handle that, so I don't discuss it here. $\endgroup$
    – Hao Zhang
    Commented Aug 3, 2016 at 6:10
  • $\begingroup$ I get the feeling what you want/need is an open minded discussion, not a definite answer. $\endgroup$
    – Andreas
    Commented Aug 5, 2016 at 17:13
  • $\begingroup$ Keep in mind that transfer of information between computers isn't free. For instance, even getting data from RAM to the GPU is a pretty costly operation, let alone from another machine's RAM, even on the local network. Transferring data between GPUs is also not free. VR requires very low latency frames (not just high frame rate), and what you are proposing sounds like it would add quite a bit of latency and may not be practical due to that. $\endgroup$
    – Alan Wolfe
    Commented Aug 5, 2016 at 18:09
  • $\begingroup$ OK, I get it. We still want to try; failing that, we'll fall back to multiple GPUs in one server. Thanks. $\endgroup$
    – Hao Zhang
    Commented Aug 8, 2016 at 0:41

1 Answer

$\begingroup$

You are essentially correct in assuming that it is technically feasible to distribute the rendering workload. It is a computational workload like any other, and arguably even better suited to parallelization by its very nature: many similarly structured units (vertices, pixels) running the same code path.

What you are forgetting, however, is latency. Having all the rendering power in the world is useless if the image does not arrive at the HMD fast enough to avoid discomfort for the user. Here is an article that explains the importance of motion-to-photon latency:

http://www.chioka.in/what-is-motion-to-photon-latency/
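To see how tight the budget is, here is a back-of-the-envelope calculation with assumed (not measured) figures: at a 90 Hz refresh rate the entire pipeline has roughly 11.1 ms per frame, and a distributed renderer must fit scene dispatch, rendering, tile readback, network transfer, and compositing inside that window.

```python
# Rough motion-to-photon budget for a distributed VR renderer.
# All inputs are illustrative assumptions, not benchmark results.

def remaining_render_budget(refresh_hz, network_rtt_ms, composite_ms):
    """Milliseconds left for actual rendering after fixed overheads."""
    frame_ms = 1000.0 / refresh_hz
    return frame_ms - network_rtt_ms - composite_ms

# Example: 90 Hz HMD, 2 ms LAN round trip, 1 ms compositing
# leaves roughly 8 ms for everything else, including the render itself.
budget = remaining_render_budget(90, 2.0, 1.0)
```

Even a couple of milliseconds of network round trip consumes a sizeable fraction of the frame budget, which is the core of the latency concern.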

$\endgroup$
  • $\begingroup$ Thanks for your answer. I have read this article before. Do you mean that distributed parallel rendering will introduce too much latency? I just want to lower the rendering latency. $\endgroup$
    – Hao Zhang
    Commented Aug 3, 2016 at 6:18
  • $\begingroup$ Yes. Distributed rendering has the inherent cost of integrating results from all the distributed nodes. In this case, I imagine serializing a scene, sending it out to computation nodes and then collecting parts of the resulting image (frame buffer tiles probably?). It seems unlikely to me that at current I/O speeds you can fit all of that + the rendering itself under ~10 ms. What I would expect from such a scenario is that you could probably get silky-smooth, high-performance rendering, but lagging a consistent dozen frames behind the player input, therefore somewhat defeating the purpose. $\endgroup$
    – IneQuation
    Commented Aug 3, 2016 at 10:29
  • $\begingroup$ Sort-last may have the issue you describe, but what about sort-first? With the 2D mode of sort-first, it seems that few interactions are introduced between rendering workers. According to Equalizer's evaluation results, it consumes little network bandwidth: eyescale.github.io/equalizergraphics.com/scalability/2D.html. Thanks $\endgroup$
    – Hao Zhang
    Commented Aug 4, 2016 at 0:29
  • $\begingroup$ But I never even mentioned interaction between workers. :) It's entirely possible to eliminate it altogether. By synchronisation I meant dispatching the workload to nodes and then compositing the results together. Also, "little" network bandwidth is a pretty imprecise and relative term – I expect it not to be "little" on the scale that VR applications operate in. It's quite simple, really – if you can make a round trip within 10 ms, you're fine. Experience simply tells me that you can't. You are welcome to prove me wrong. :) $\endgroup$
    – IneQuation
    Commented Aug 4, 2016 at 7:19
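To put a number on the "little bandwidth" question raised in this exchange, here is a rough estimate, again with assumed figures: even under sort-first, the tiles together cover one full uncompressed frame per displayed frame, and that raw pixel traffic alone is substantial at VR resolutions and refresh rates.

```python
# Raw pixel traffic a sort-first compositor must collect per second.
# Assumed figures for illustration; real systems would compress.

def sortfirst_bandwidth_mb_s(width, height, bytes_per_pixel, fps):
    """All tiles together cover the full frame once per displayed frame."""
    return width * height * bytes_per_pixel * fps / 1e6

# Example: 2160x1200 RGBA frames at 90 Hz -> roughly 933 MB/s uncompressed.
bw = sortfirst_bandwidth_mb_s(2160, 1200, 4, 90)
```

Compression and smaller per-node tiles reduce the per-link load, but the aggregate at the compositing node stays in this range, which supports the point about round-trip cost.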
