Questions tagged [gpu]
The gpu tag has no usage guidance.
27 questions
0 votes · 1 answer · 27 views
How do I decide the specifications for a compute cluster that I need for a fine-tuning task in Azure ML?
I was using the Standard_NC48ads_A100_v4 compute cluster with 2 nodes to fine-tune a Phi-3-mini-128k-instruct model. The training data was around 25 MB, and I set the training batch size to 10. ...
3 votes · 0 answers · 136 views
How to package and distribute a Tensorflow GPU desktop application
I am developing a desktop application that utilises Tensorflow. The aim of the application is to let users easily train a given model and use it for inference within the app. I want to support ...
3 votes · 3 answers · 830 views
How does cuRAND use a GPU to accelerate random number generation? Don't those require a state?
My understanding is that every PRNG or QRNG requires a state to prevent the next item in its sequence from being too predictable, which is sensible, as they're all running on deterministic hardware.
...
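The short answer is that cuRAND does keep state, one generator state per thread, decorrelated by starting each thread on a different subsequence of the same seed. A minimal sketch using cuRAND's device API from curand_kernel.h (grid and launch parameters are illustrative):

```cuda
#include <curand_kernel.h>

// Each thread owns an independent generator state; curand_init() places
// every thread on a different subsequence of one seed so their random
// streams do not overlap.
__global__ void setupStates(curandState *states, unsigned long long seed) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    curand_init(seed, /*subsequence=*/id, /*offset=*/0, &states[id]);
}

__global__ void generateUniform(curandState *states, float *out, int n) {
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id >= n) return;
    curandState local = states[id];    // work on a register copy
    out[id] = curand_uniform(&local);  // advances this thread's sequence
    states[id] = local;                // persist state for the next launch
}
```

The acceleration comes from running thousands of these per-thread generators concurrently, not from making any single sequence faster.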
2 votes · 1 answer · 195 views
Docker and GPU-based computations. Feasible? [closed]
Recently I ran across this question and this Nvidia-docker project, which is an Nvidia Docker implementation, and it made me wonder where, why, and how this scheme makes sense.
I found out some ...
1 vote · 1 answer · 163 views
What tools exist to determine the speed-up a GPU will have on an algorithm?
Basically, I am wondering what sort of speed I will get by parallelizing an algorithm to work with GPUs. I am wondering if someone has implemented queueing theory/Amdahl's law with a UI, or if everyone ...
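Most such tools are thin calculators over Amdahl's law, which is compact enough to apply by hand. If a fraction p of the runtime can be parallelized and the GPU speeds that fraction up by a factor s, the overall speedup is:

```latex
% Amdahl's law: overall speedup S when a fraction p of the runtime
% is accelerated by a factor s (the serial fraction 1-p is untouched)
S(s) = \frac{1}{(1 - p) + \frac{p}{s}},
\qquad
\lim_{s \to \infty} S(s) = \frac{1}{1 - p}
```

For example, with p = 0.9 even an infinitely fast GPU caps the overall speed-up at 10x, which is why profiling the serial fraction matters more than any particular tool.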
6 votes · 1 answer · 4k views
Definition and usage of "warp" in parallel / GPU programming
I have come across the word "warp" in a few places but haven't seen a thorough definition (there's no Wikipedia page on it either).
A brief definition is found here:
In the SIMT paradigm, threads ...
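In NVIDIA's terms, a warp is the group of 32 threads that a streaming multiprocessor issues in lockstep (the SIMT unit of execution). CUDA exposes the concept directly, so a short sketch may show it better than a definition; this assumes CUDA 9+ for the `*_sync` shuffle intrinsics:

```cuda
// All 32 lanes of a warp execute each instruction together, so they can
// trade register values with shuffles: no shared memory, no barrier.
__device__ float warpSum(float v) {
    // Halve the exchange distance each step: 16, 8, 4, 2, 1.
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        v += __shfl_down_sync(0xffffffffu, v, offset);
    return v; // lane 0 now holds the sum across the warp
}
```

Divergent branches within a warp are serialized (each path runs with the other lanes masked off), which is why warp-uniform control flow matters for performance.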
0 votes · 1 answer · 431 views
Example of parallel sum algorithm on GPU
I'm trying to imagine how you would go about implementing summation (or reduction?) on a parallel architecture, and I'm having a difficult time.
Specifically, I'm thinking in terms of WebGL arrays of vectors ...
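The question is phrased in WebGL terms, but the standard answer is the same tree-shaped reduction on any parallel architecture; a sketch in CUDA, assuming the block size is a power of two:

```cuda
// Each block loads a slice into shared memory and folds it in half
// repeatedly: n elements reduce in O(log n) parallel steps.
__global__ void blockSum(const float *in, float *out, int n) {
    extern __shared__ float s[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    s[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride) s[tid] += s[tid + stride];
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = s[0]; // one partial sum per block
}
// Launch with blockSum<<<blocks, threads, threads * sizeof(float)>>>,
// then reduce the per-block partials in a second pass (or on the CPU).
```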
4 votes · 3 answers · 2k views
Parallel processing of a tree on a GPU
I have seen a few papers on parallel/GPU processing of trees, but after briefly looking through them I wasn't able to grasp what they did. The closest to a helpful explanation was found in ...
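A recurring trick in that literature is to flatten recursion into level-order passes: keep a "frontier" of node indices, run one thread per frontier node, and build the next frontier with atomics, so no thread ever needs a stack. A hedged sketch (the Node layout and kernel are hypothetical, just to show the shape):

```cuda
struct Node { int firstChild, childCount; float value; };

// Process one tree level: thread i handles frontier node i and appends
// that node's children to the next level's frontier via an atomic cursor.
__global__ void processLevel(const Node *nodes, const int *frontier,
                             int frontierSize, int *next, int *nextSize) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= frontierSize) return;
    Node nd = nodes[frontier[i]];
    // ... per-node work on nd.value goes here ...
    int base = atomicAdd(nextSize, nd.childCount);
    for (int c = 0; c < nd.childCount; ++c)
        next[base + c] = nd.firstChild + c;  // children stored contiguously
}
```

The host loops: launch, swap frontier/next, and repeat until the frontier is empty, processing one whole level of the tree per kernel launch.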
0 votes · 1 answer · 2k views
Feature of CPU needed to run JavaScript fast
This is more of a computer engineering question, but what feature of a CPU makes it run JavaScript fast? I used to access the internet with an AMD Phenom II with 6 cores, and I could almost have as ...
7 votes · 0 answers · 194 views
Incorporating existing 2D OpenCL/OpenGL application in 3D scene
There is an existing real-time, scientific visualization application that uses OpenCL and OpenGL to render complex 2D graphs. My goal is to incorporate this application into a 3D rendered scene. At ...
2 votes · 1 answer · 202 views
Large Scale Machine Learning vs Traditional HPC Hardware
I've spent the last few days working with tensorflow for the first time as part of a natural language processing assignment for my degree. It's been interesting (fun isn't the right word) trying to ...
43 votes · 7 answers · 10k views
In software programming, would it be possible to have both CPU and GPU loads at 100%?
This is a general question on a subject I've found interesting as a gamer: CPU/GPU bottlenecks and programming. If I'm not mistaken, I've come to understand that both CPU and GPU calculate stuff, but ...
1 vote · 1 answer · 443 views
Does this workload fit GPUs?
I have a perfectly parallel function that would run great on a machine with 1024 cores and 4GB RAM. There's quite a lot of branching (doing set union and traversing structs). There is no communication ...
8 votes · 1 answer · 3k views
Is hardware-accelerated GUI data kept on the GPU?
I am doing some research into how most hardware-accelerated GUI libraries work. I actually only care about their rendering backends here. I am trying to figure out what would be the best way ...
-1 votes · 1 answer · 336 views
Linking compute shader kernels without a CPU pass
Is it possible to pass data between compute shader kernels without having to create a new buffer and a CPU link (using Unity with a C# interface)?
For example, I have a kernel with position data on a set ...
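In Unity the usual answer is to bind the same ComputeBuffer to both kernels with ComputeShader.SetBuffer and simply Dispatch them in order; the data never leaves the GPU. The CUDA analogue below shows the same idea with two kernels chained on one device allocation (names are illustrative):

```cuda
// Kernel A writes positions; kernel B reads them. Launches on the same
// stream execute in order, so no copy back to the CPU is needed between.
__global__ void integrate(float3 *pos, const float3 *vel, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    pos[i].x += vel[i].x;
    pos[i].y += vel[i].y;
    pos[i].z += vel[i].z;
}

__global__ void gatherHeights(const float3 *pos, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = pos[i].y; // consumes what integrate() just wrote
}
// Host: integrate<<<g, b>>>(pos, vel, n);
//       gatherHeights<<<g, b>>>(pos, out, n);  // same pos, no CPU hop
```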