Shared memory declaration struct for templatized types.

Classes
struct	SharedMemory< T >
	Wrapper class for templatized dynamic shared memory arrays.

Detailed Description

Shared memory declaration struct for templatized types.

sharedmem.h

Because dynamically sized shared memory arrays are declared "extern" in CUDA, we can't templatize their types directly. To get around this, we declare a simple wrapper struct that will declare the extern array with a different name depending on the type. This avoids linker errors about multiple definitions.
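The paragraph above can be illustrated with a minimal sketch of such a wrapper. This is an assumption about how the struct might be written, not a copy of sharedmem.h: the array names `s_int` and `s_float` and the `static_assert` in the primary template are illustrative choices, and only two specializations are shown.

```cuda
// Primary template: any type without a specialization fails at compile
// time as soon as getPointer() is instantiated (illustrative choice).
template <typename T>
struct SharedMemory
{
    __device__ T* getPointer()
    {
        static_assert(sizeof(T) == 0,
                      "SharedMemory<T>: no specialization for this type");
        return nullptr;
    }
};

// Each specialization declares the extern array under its own name,
// so the int and float versions never share a symbol.
template <>
struct SharedMemory<int>
{
    __device__ int* getPointer()
    {
        extern __shared__ int s_int[];   // hypothetical array name
        return s_int;
    }
};

template <>
struct SharedMemory<float>
{
    __device__ float* getPointer()
    {
        extern __shared__ float s_float[];   // hypothetical array name
        return s_float;
    }
};
```

Supporting another element type under this sketch means adding one more specialization with its own array name.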

To use dynamically allocated shared memory in a templatized global or device function, just replace code like this:

 template<class T>
 __global__ void
 foo( T* d_out, T* d_in) 
 {
     // Shared mem size is determined by the host app at run time
     extern __shared__  T sdata[];
     ...
     doStuff(sdata);
     ...
 }

with this:

 template<class T>
 __global__ void
 foo( T* d_out, T* d_in) 
 {
     // Shared mem size is determined by the host app at run time
     SharedMemory<T> smem;
     T* sdata = smem.getPointer();
     ...
     doStuff(sdata);
     ...
 }
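The comment "Shared mem size is determined by the host app at run time" refers to the third launch parameter of the kernel. The following sketch shows one way the host side might look. The kernel `reverseTile`, the helper `reverseTilesOnDevice`, and the inline stand-in for sharedmem.h are all hypothetical names introduced for illustration. The sketch assumes the input length equals `numBlocks * threadsPerBlock`.

```cuda
#include <cuda_runtime.h>

// Stand-in for sharedmem.h so the example compiles by itself (assumption:
// the real header provides equivalent specializations).
template <typename T>
struct SharedMemory;   // only the specializations below are defined

template <>
struct SharedMemory<int>
{
    __device__ int* getPointer() { extern __shared__ int s_int[]; return s_int; }
};

template <>
struct SharedMemory<float>
{
    __device__ float* getPointer() { extern __shared__ float s_float[]; return s_float; }
};

// Hypothetical kernel: each block reverses its own tile via shared memory.
template <class T>
__global__ void reverseTile(T* d_out, const T* d_in)
{
    SharedMemory<T> smem;
    T* sdata = smem.getPointer();

    const unsigned int tid  = threadIdx.x;
    const unsigned int base = blockIdx.x * blockDim.x;

    sdata[tid] = d_in[base + tid];
    __syncthreads();   // the whole tile must be loaded before it is read back
    d_out[base + tid] = sdata[blockDim.x - 1 - tid];
}

// Hypothetical host helper: copies data to the device, launches the kernel
// with one T of dynamic shared memory per thread, and copies the result back.
template <class T>
bool reverseTilesOnDevice(T* h_out, const T* h_in, int numBlocks, int threadsPerBlock)
{
    const size_t bytes = size_t(numBlocks) * threadsPerBlock * sizeof(T);
    T* d_in  = nullptr;
    T* d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return false; }

    cudaError_t err = cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
    {
        // Third launch parameter: dynamic shared memory size in bytes.
        const size_t smemBytes = threadsPerBlock * sizeof(T);
        reverseTile<T><<<numBlocks, threadsPerBlock, smemBytes>>>(d_out, d_in);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
    {
        // A blocking cudaMemcpy on the default stream waits for the kernel.
        err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_in);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Calling `reverseTilesOnDevice(out, in, 2, 4)` on the eight integers 0 through 7 reverses each four-element tile independently, giving 3, 2, 1, 0, 7, 6, 5, 4. The same helper works for `float` because the kernel body never names the shared array directly.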
