Questions tagged [opencl]
OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors.
5,750 questions
0 votes, 0 answers, 45 views
OpenCL: high fixed cost to run GPU instructions, what to do?
I'm using OpenCL in C++. My GPU is an NVIDIA GeForce RTX 3070.
I have a very simple kernel:
__kernel void op_exp_f(global float* vOut)
{
    const uint i = get_global_id(0);
    vOut[i] = exp(vOut[i]);
}
...
0 votes, 0 answers, 22 views
How to use hashcat opencl sha256_update function if buf is larger than 64 bytes?
I'm trying to change the logic of one of the modules, using SHA-256 in place of the existing hashing.
The problem is that the data for the hash exceeds the buffer limit of 64 bytes.
I think there should be ...
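For context, SHA-256 consumes its input in 64-byte blocks, so a longer buffer is typically fed to the hash one block at a time. The following is a minimal C++ sketch of that chunking arithmetic only. The helper name `split_into_blocks` is hypothetical and is not part of hashcat's API; whether hashcat's `sha256_update` accepts longer inputs directly is not shown in the excerpt.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// SHA-256 processes its input in 64-byte blocks.
constexpr std::size_t kSha256BlockBytes = 64;

// Hypothetical helper (not hashcat's API): returns the sizes of the pieces
// a buffer of `len` bytes is split into when each piece is at most one
// 64-byte block. The last piece may be shorter.
std::vector<std::size_t> split_into_blocks(std::size_t len) {
    std::vector<std::size_t> sizes;
    for (std::size_t offset = 0; offset < len; offset += kSha256BlockBytes) {
        sizes.push_back(std::min(kSha256BlockBytes, len - offset));
    }
    return sizes;
}
```

A 150-byte buffer would be split into pieces of 64, 64, and 22 bytes.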
1 vote, 1 answer, 69 views
Register usage for arrays in CUDA
How do arrays map to GPU registers in NVIDIA GPUs (and in AMD GPUs when using OpenCL)?
So when I define an array like uint8_t X[64], is it stored in 16 32-bit registers (4 bytes per register) or in ...
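To make the "4 bytes per register" layout in the question concrete, here is a small C++ sketch of the indexing arithmetic it would imply. It only models packing 64 bytes into sixteen 32-bit words; it makes no claim about how any NVIDIA or AMD compiler actually allocates registers. The function names and the little-endian byte order within a word are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustration only: pack 64 bytes into sixteen 32-bit words, four bytes per
// word, lowest-indexed byte in the least significant position (assumed).
std::array<std::uint32_t, 16> pack_bytes(const std::array<std::uint8_t, 64>& x) {
    std::array<std::uint32_t, 16> words{};
    for (std::size_t i = 0; i < x.size(); ++i) {
        words[i / 4] |= static_cast<std::uint32_t>(x[i]) << (8 * (i % 4));
    }
    return words;
}

// Read byte i back out of the packed words with a shift and a mask.
std::uint8_t load_byte(const std::array<std::uint32_t, 16>& words, std::size_t i) {
    return static_cast<std::uint8_t>((words[i / 4] >> (8 * (i % 4))) & 0xFFu);
}
```

In this model, reading `X[i]` costs a word select (`i / 4`) plus a shift and mask (`i % 4`), which is the trade-off against spending one full word per byte.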
0 votes, 0 answers, 38 views
Installing/building PyOpenCL on a Mali GPU causes Linux kernel panic: Unable to handle kernel NULL pointer dereference
I'm trying to install PyOpenCL on a Mali G78 GPU under Ubuntu 22.04 (aarch64). However, I keep encountering kernel panics.
Does anyone know why this is happening?
pip ...
0 votes, 2 answers, 125 views
How to use 128-bit floats and complex numbers in OpenCL/CUDA?
I need to use 128-bit floating-point numbers and complex numbers in parallel GPU computing using OpenCL or CUDA.
Are there any ways to achieve this without implementing it yourself?
I looked at the ...
0 votes, 1 answer, 55 views
How is the struct 'param_traits' defined in OpenCL?
In OpenCL's C++ header file, at line 1611, there is the following code:
template <typename enum_type, cl_int Name>
struct param_traits {};
#define CL_HPP_DECLARE_PARAM_TRAITS_(token, param_name, T) \
...
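The excerpt is cut off before the macro body, so the following is not the header's expansion. It is a generic C++ sketch of the traits idiom that an empty primary template like this usually supports: specializations map a (tag type, integer code) pair to a result type. All names here (`my_param_traits`, `my_device_info`, `MY_DEVICE_NAME`, `MY_DEVICE_UNITS`, `default_value`) are hypothetical stand-ins.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins; none of these names come from the OpenCL headers.
using my_int = int;
struct my_device_info;                 // tag type naming a family of queries
constexpr my_int MY_DEVICE_NAME = 1;   // made-up query codes
constexpr my_int MY_DEVICE_UNITS = 2;

// Primary template: empty, so an unknown (tag, code) pair has no param_type.
template <typename enum_type, my_int Name>
struct my_param_traits {};

// Each specialization records the result type for one query code.
template <>
struct my_param_traits<my_device_info, MY_DEVICE_NAME> {
    using param_type = std::string;
};
template <>
struct my_param_traits<my_device_info, MY_DEVICE_UNITS> {
    using param_type = unsigned;
};

// A getter can then choose its return type from the code at compile time.
template <my_int Name>
typename my_param_traits<my_device_info, Name>::param_type default_value() {
    return typename my_param_traits<my_device_info, Name>::param_type{};
}
```

Asking for a code with no specialization fails to compile, because the empty primary template has no `param_type` member.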
1 vote, 0 answers, 74 views
OpenCL flush absurdly slow, seemingly triggered by clEnqueueReleaseGLObjects in OpenCL/OpenGL interop
I'm writing an interactive application which uses OpenCL 1.2 to render each frame and which uses OpenCL-OpenGL interop to copy the frame to an OpenGL texture which is finally rendered via OpenGL. The ...
2 votes, 1 answer, 92 views
ArrayFire build has an issue with OpenCL turned OFF
I'm getting an error when building ArrayFire.
I wish to use ArrayFire purely with CUDA, yet the ArrayFire build seems to require OpenCL, despite my setting the flag for OpenCL not to be built.
Build ...
0 votes, 0 answers, 69 views
Prefix hash array: understanding OpenCL slowdown
I'm trying to speed up building a hash prefix array with OpenCL 3.0.
Starting with v = [v[0], v[1], v[2], ... ], I want to build:
pref = [v[0], B * v[0] + v[1], B**2 * v[0] + B * v[1] + v[2], ... ].
I'...
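The definition above is equivalent to the recurrence pref[0] = v[0], pref[i] = B * pref[i-1] + v[i]. A minimal sequential C++ reference, useful as a baseline for checking a parallel version, might look like the sketch below. The function name `prefix_hash` is hypothetical, and the wrap-around modulo 2^64 is an assumption: the excerpt does not say which modulus, if any, the question uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sequential reference: pref[i] = B * pref[i-1] + v[i], with pref[0] = v[0].
// Assumption: arithmetic wraps modulo 2^64 through unsigned overflow.
std::vector<std::uint64_t> prefix_hash(const std::vector<std::uint64_t>& v,
                                       std::uint64_t B) {
    std::vector<std::uint64_t> pref(v.size());
    std::uint64_t acc = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        acc = acc * B + v[i];  // Horner step: shift previous hash by B, add v[i]
        pref[i] = acc;
    }
    return pref;
}
```

With v = [1, 2, 3] and B = 10 this yields [1, 12, 123]. Each element depends on the previous one, which is why a direct port of this loop does not parallelize without restructuring it as a scan.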
2 votes, 1 answer, 48 views
OpenCL - How to suppress build errors from going to Standard Error?
In my application, I have a single OpenCL C program that gets built with several swappable modules that change parts of the code—notably, it changes some macros so that different arithmetic types are ...
-2 votes, 1 answer, 70 views
GPU Thread Management [duplicate]
I'm currently working on a ray tracing algorithm for a non-image rendering application, utilizing both CUDA and OpenCL for GPU acceleration. My algorithm processes more than 1 million rays, and I'm ...
0 votes, 0 answers, 16 views
Derived class as argument of OpenCL kernel function
I am trying to compile the following OpenCL kernel code with clang-16:
clang-16 -cl-std=clc++2021 -c -emit-llvm -target spir64 -Xclang -finclude-default-header \
-o /workspace/test/...
1 vote, 1 answer, 41 views
How does OpenCL set up the memory buffer between an integrated device and the CPU cores?
An external device usually has its own separate memory, which requires a DMA memory region between the device and the CPU to copy data between system DRAM and the device's internal DRAM.
Therefore, I ...
1 vote, 1 answer, 99 views
OpenCL 1.2: Global memory consistency surrounding atomic operations?
I'm trying to implement global synchronization in OpenCL 1.2 using atomics and was wondering if there's any way to ensure that reads from different work groups (that provably -- by the logic of the ...
0 votes, 0 answers, 34 views
How to pass GBM memory directly to an OpenCL kernel function
I have a camera driver (for Linux) which can write the raw data directly into GBM GPU memory by specifying the fd (the fd is obtained using gbm_bo_create & gbm_bo_get_fd).
And I want my OpenCL kernel ...