Questions tagged [opencl]

OpenCL (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms consisting of CPUs, GPUs, and other processors.

0 votes
0 answers
45 views

OpenCL: high fixed cost to run GPU instructions, what to do?

I'm using OpenCL in C++. My GPU is an NVIDIA GeForce RTX 3070. I have a very simple kernel: __kernel void op_exp_f(global float* vOut) { const uint i = get_global_id(0); vOut[i] = exp(vOut[i]); } ...
Joseph Perez
0 votes
0 answers
22 views

How to use the hashcat OpenCL sha256_update function if buf is larger than 64 bytes?

I'm trying to change the logic of one of the modules to use SHA-256 instead of another hash. The problem is that the data for the hash exceeds the buffer limit of 64 bytes. I think there should be ...
First Second
1 vote
1 answer
69 views

Register usage for arrays in CUDA

How do arrays map to GPU registers in NVIDIA GPUs (and in AMD GPUs when using OpenCL)? So when I define an array like uint8_t X[64], is it stored in 16 32-bit registers (4 bytes per register) or in ...
Eugene Krasnikov
0 votes
0 answers
38 views

Installing/building PyOpenCL on a Mali GPU causes a Linux kernel panic: Unable to handle kernel NULL pointer dereference

I'm trying to install PyOpenCL on a system with a Mali G78 GPU running Ubuntu 22.04 (aarch64). However, I keep encountering kernel panics. Does anyone know why this is happening? pip ...
Bluewings
0 votes
2 answers
125 views

How to use 128-bit floats and complex numbers in OpenCL/CUDA?

I need to use 128-bit floating-point numbers and complex numbers in parallel GPU computing with OpenCL or CUDA. Is there any way to achieve this without implementing it myself? I looked at the ...
German
  • 323
0 votes
1 answer
55 views

How is the struct 'param_traits' defined in OpenCL?

In OpenCL's C++ header file, at line 1611, there is the following code: template <typename enum_type, cl_int Name> struct param_traits {}; #define CL_HPP_DECLARE_PARAM_TRAITS_(token, param_name, T) \ ...
diverger
  • 176
1 vote
0 answers
74 views

OpenCL flush absurdly slow, seemingly triggered by clEnqueueReleaseGLObjects in OpenCL/OpenGL interop

I'm writing an interactive application which uses OpenCL 1.2 to render each frame and which uses OpenCL-OpenGL interop to copy the frame to an OpenGL texture which is finally rendered via OpenGL. The ...
Danimator
2 votes
1 answer
92 views

ArrayFire build has an issue with OpenCL turned OFF

I'm getting an error when building ArrayFire. I want to use ArrayFire purely with CUDA, yet the ArrayFire build seems to require OpenCL, even though I set the flag for OpenCL not to be built. Build ...
Hugo Phibbs
0 votes
0 answers
69 views

Prefix hash array: understanding OpenCL slowdown

I'm trying to speed up building a hash prefix array with OpenCL 3.0. Starting with v = [v[0], v[1], v[2], ... ], I want to build: pref = [v[0], B * v[0] + v[1], B**2 * v[0] + B * v[1] + v[2], ... ]. I'...
catalyst
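
For the prefix hash entry above, the definition in the excerpt implies the recurrence pref[0] = v[0], pref[i] = B * pref[i-1] + v[i]. A minimal sequential reference sketch in C++ follows. It assumes 64-bit unsigned arithmetic that wraps modulo 2^64; the question's actual element type, modulus, and OpenCL kernel are not visible in the excerpt, and prefix_hash is a hypothetical name used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: sequential reference for a polynomial prefix hash.
// Assumption: values are 64-bit unsigned and overflow wraps modulo 2^64.
std::vector<std::uint64_t> prefix_hash(const std::vector<std::uint64_t>& v,
                                       std::uint64_t B) {
    std::vector<std::uint64_t> pref(v.size());
    std::uint64_t acc = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        // pref[i] = B * pref[i-1] + v[i]; with acc = 0 this yields pref[0] = v[0].
        acc = acc * B + v[i];
        pref[i] = acc;
    }
    return pref;
}
```

A reference like this can serve as a correctness check for a parallel version: each step depends on the previous one, so a GPU implementation would typically need a scan-style formulation rather than a direct port of this loop.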
2 votes
1 answer
48 views

OpenCL - How to suppress build errors from going to Standard Error?

In my application, I have a single OpenCL C program that gets built with several swappable modules that change parts of the code—notably, it changes some macros so that different arithmetic types are ...
Xirema
  • 20.2k
-2 votes
1 answer
70 views

GPU Thread Management [duplicate]

I'm currently working on a ray tracing algorithm for a non-image rendering application, utilizing both CUDA and OpenCL for GPU acceleration. My algorithm processes more than 1 million rays, and I'm ...
berk2609
0 votes
0 answers
16 views

Derived class as argument of OpenCL kernel function

I am trying to compile the following OpenCL kernel code with clang-16: clang-16 -cl-std=clc++2021 -c -emit-llvm -target spir64 -Xclang -finclude-default-header \ -o /workspace/test/...
Jackdu0049
1 vote
1 answer
41 views

How does OpenCL set up the memory buffer between an integrated device and the CPU cores?

An external device usually has its own separate memory, which requires a DMA memory region between the device and the CPU to copy data between system DRAM and the device's internal DRAM. Therefore, I ...
ruach
  • 1,439
1 vote
1 answer
99 views

OpenCL 1.2: Global memory consistency surrounding atomic operations?

I'm trying to implement global synchronization in OpenCL 1.2 using atomics and was wondering if there's any way to ensure that reads from different work groups (that provably -- by the logic of the ...
Danimator
0 votes
0 answers
34 views

How to pass GBM memory directly to an OpenCL kernel function

I have a camera driver (for Linux) that can write raw data directly into GBM GPU memory by specifying the fd (the fd is obtained using gbm_bo_create & gbm_bo_get_fd). And I want my OpenCL kernel ...
u0804138