CUDPP
2.3
CUDA Data-Parallel Primitives Library
Shared memory declaration struct for templatized types.
Classes
struct SharedMemory< T >
    Wrapper class for templatized dynamic shared memory arrays.
Because dynamically sized shared memory arrays are declared "extern" in CUDA, we can't templatize their types directly. To get around this, we declare a simple wrapper struct that will declare the extern array with a different name depending on the type. This avoids linker errors about multiple definitions.
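The wrapper idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CUDPP source: the struct name SharedMemorySketch and the array names s_int and s_float are hypothetical, and leaving the primary template undefined is just one way to reject unsupported types at compile time.

```cuda
// Hypothetical stand-in for a SharedMemory<T>-style wrapper (not the CUDPP source).
// The primary template is declared but never defined, so instantiating it
// with a type that has no specialization is a compile-time error.
template <typename T>
struct SharedMemorySketch;

// One specialization per supported type. Each declares the extern array
// under its own name, so no two element types share a declaration.
template <>
struct SharedMemorySketch<int>
{
    __device__ int* getPointer()
    {
        extern __shared__ int s_int[];
        return s_int;
    }
};

template <>
struct SharedMemorySketch<float>
{
    __device__ float* getPointer()
    {
        extern __shared__ float s_float[];
        return s_float;
    }
};
```

All extern __shared__ arrays in a kernel start at the same address in the block's dynamic shared memory, so the differently named arrays are views of one allocation rather than separate buffers.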
To use dynamically allocated shared memory in a templatized __global__ or __device__ function, replace code like this:
    template<class T>
    __global__ void foo( T* d_out, T* d_in)
    {
        // Shared mem size is determined by the host app at run time
        extern __shared__ T sdata[];
        ...
        doStuff(sdata);
        ...
    }
with this:
    template<class T>
    __global__ void foo( T* d_out, T* d_in)
    {
        // Shared mem size is determined by the host app at run time
        SharedMemory<T> smem;
        T* sdata = smem.getPointer();
        ...
        doStuff(sdata);
        ...
    }
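The comment "determined by the host app at run time" refers to the third kernel launch parameter, which gives the dynamic shared memory size in bytes. The following is a hedged, self-contained sketch of that host side. The kernel reverseTile and the float-only SharedMemory stand-in are hypothetical and exist only so the example compiles on its own; in a CUDPP build the library's own SharedMemory<T> would take the stand-in's place.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in so the example is self-contained (float only).
template <typename T> struct SharedMemory;

template <> struct SharedMemory<float>
{
    __device__ float* getPointer()
    {
        extern __shared__ float s_float[];
        return s_float;
    }
};

// Hypothetical kernel: each block reverses its own tile of the input
// by staging it in dynamic shared memory.
template <class T>
__global__ void reverseTile(T* d_out, const T* d_in)
{
    SharedMemory<T> smem;
    T* sdata = smem.getPointer();

    unsigned int tid  = threadIdx.x;
    unsigned int base = blockIdx.x * blockDim.x;

    sdata[tid] = d_in[base + tid];
    __syncthreads();  // the whole tile must be loaded before it is read back
    d_out[base + tid] = sdata[blockDim.x - 1 - tid];
}

int main()
{
    const int threads = 256;
    const int blocks  = 4;
    const int n       = threads * blocks;
    const size_t bytes = n * sizeof(float);

    std::vector<float> h_in(n), h_out(n);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float* d_in  = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    // Third launch parameter: dynamic shared memory per block, in bytes.
    const size_t smemBytes = threads * sizeof(float);
    reverseTile<float><<<blocks, threads, smemBytes>>>(d_out, d_in);

    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    // Check every element against the expected per-tile reversal.
    bool ok = (err == cudaSuccess);
    for (int i = 0; i < n && ok; ++i) {
        int base = (i / threads) * threads;
        int src  = base + (threads - 1 - (i - base));
        ok = (h_out[i] == h_in[src]);
    }
    std::printf("%s\n", ok ? "ok" : "mismatch");

    cudaFree(d_in);
    cudaFree(d_out);
    return ok ? 0 : 1;
}
```

The byte count is computed from the element type on the host (threads * sizeof(float) here), so a launch for a different T needs a matching size as well as a matching specialization of the wrapper.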