CUDA: what is this loop doing?



Hey, I saw this example kernel on a website:

    __global__ void loop1(int n, float alpha, float *x, float *y)
    {
        int i;
        /* global index of this thread across the whole grid */
        int i0 = blockIdx.x * blockDim.x + threadIdx.x;
        /* step by the total number of threads in the grid */
        for (i = i0; i < n; i += blockDim.x * gridDim.x) {
            y[i] = alpha * x[i] + y[i];
        }
    }

It computes the same thing as this C loop:

    for (i = 0; i < n; i++) {
        y[i] = alpha * x[i] + y[i];
    }

Surely the for loop inside the kernel is not necessary? You could just do y[i0] = alpha * x[i0] + y[i0] and remove the for loop completely.

I'm just curious why it is there and what its purpose is. This is assuming a kernel call such as loop1<<<64,256>>>, so presumably gridDim.x = 1.

The loop is required if your vector has more entries than the number of threads you have started. When possible, it is of course more efficient to simply start enough threads.
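To make that concrete, here is a minimal sketch in plain C that emulates the kernel on the CPU. It is not CUDA code: the function names saxpy_serial and saxpy_grid_stride_emulated and the parameters grid_dim and block_dim are made up for illustration, and the two outer loops stand in for blockIdx.x and threadIdx.x. Note that for a launch like loop1<<<64,256>>> the first launch argument is the number of blocks, so gridDim.x is 64 and blockDim.x is 256, giving 64 * 256 = 16384 threads in total.

```c
#include <assert.h>

/* Serial reference: the plain C loop from the question. */
void saxpy_serial(int n, float alpha, const float *x, float *y)
{
    for (int i = 0; i < n; i++) {
        y[i] = alpha * x[i] + y[i];
    }
}

/* CPU emulation of loop1<<<grid_dim, block_dim>>>(n, alpha, x, y).
   Hypothetical helper: on a GPU the emulated "threads" would run in
   parallel instead of one after another. */
void saxpy_grid_stride_emulated(int n, float alpha, const float *x, float *y,
                                int grid_dim, int block_dim)
{
    assert(grid_dim > 0 && block_dim > 0);
    int stride = block_dim * grid_dim;  /* total number of threads */

    for (int block = 0; block < grid_dim; block++) {          /* blockIdx.x  */
        for (int thread = 0; thread < block_dim; thread++) {  /* threadIdx.x */
            int i0 = block * block_dim + thread;  /* this thread's first element */
            /* Without this loop, elements with index >= stride
               would never be touched. */
            for (int i = i0; i < n; i += stride) {
                y[i] = alpha * x[i] + y[i];
            }
        }
    }
}
```

With grid_dim = 64, block_dim = 256 and, say, n = 40000, each emulated thread handles two or three elements (i0, i0 + 16384 and possibly i0 + 32768). If n is at most 16384, the inner loop body runs at most once per thread, which is the case where the loop could be replaced by a bounds check such as if (i0 < n).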
