CUDPP
2.2
CUDA Data-Parallel Primitives Library
Reduce functionality header file - contains CUDPP interface (not public)
Functions
void allocReduceStorage (CUDPPReducePlan *plan)
Allocate intermediate arrays used by reductions.
void freeReduceStorage (CUDPPReducePlan *plan)
Deallocate intermediate block sums arrays in a CUDPPReducePlan object.
void cudppReduceDispatch (void *d_out, const void *d_in, size_t numElements, const CUDPPReducePlan *plan)
Dispatch function to perform a parallel reduction on an array with the specified configuration.
void allocReduceStorage (CUDPPReducePlan *plan)
Allocate intermediate arrays used by reductions.
Reductions of large arrays must be split into multiple blocks, where each block is reduced by a single CUDA thread block. Each thread block writes its partial sum to global memory, and a second pass reduces those partial sums to a single element.
[in,out] | plan | Pointer to CUDPPReducePlan object containing options and number of elements, which is used to compute storage requirements, and within which intermediate storage is allocated. |
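The two-pass scheme described above can be illustrated with a minimal host-side C++ sketch. It is not CUDPP code: the names numBlocksFor, blockPartialSums, and twoPassSum are hypothetical, the operator is fixed to integer addition, and each loop iteration merely stands in for the work of one CUDA thread block. The size of the partial array is the kind of storage requirement that the plan would have to account for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: how many blocks are needed to cover n elements.
// This is also the number of partial sums that need intermediate storage.
std::size_t numBlocksFor(std::size_t n, std::size_t blockSize) {
    assert(blockSize > 0);
    return (n + blockSize - 1) / blockSize;  // ceiling division
}

// Pass 1: each "block" reduces its own slice of the input to one partial sum.
// On the GPU each iteration would be handled by one thread block.
std::vector<int> blockPartialSums(const std::vector<int>& in,
                                  std::size_t blockSize) {
    std::vector<int> partial(numBlocksFor(in.size(), blockSize), 0);
    for (std::size_t b = 0; b < partial.size(); ++b) {
        const std::size_t begin = b * blockSize;
        const std::size_t end = std::min(begin + blockSize, in.size());
        partial[b] = std::accumulate(in.begin() + begin, in.begin() + end, 0);
    }
    return partial;
}

// Pass 2: reduce the partial sums to a single element.
int twoPassSum(const std::vector<int>& in, std::size_t blockSize) {
    const std::vector<int> partial = blockPartialSums(in, blockSize);
    return std::accumulate(partial.begin(), partial.end(), 0);
}
```

For example, reducing the values 1 through 10 with a block size of 4 in this sketch needs three partial sums (10, 26, and 19), which the second pass combines into 55.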
void freeReduceStorage (CUDPPReducePlan *plan)
Deallocate intermediate block sums arrays in a CUDPPReducePlan object.
These arrays must have been allocated by allocReduceStorage(), which is called by the constructor of CUDPPReducePlan.
[in,out] | plan | Pointer to CUDPPReducePlan object initialized by allocReduceStorage(). |
void cudppReduceDispatch (void *d_odata, const void *d_idata, size_t numElements, const CUDPPReducePlan *plan)
Dispatch function to perform a parallel reduction on an array with the specified configuration.
This is the dispatch routine which calls reduceArray() with appropriate template parameters and arguments to achieve the reduction as specified in plan.
[out] | d_odata | The output array that receives the reduction result |
[in] | d_idata | The input array |
[in] | numElements | The number of elements to reduce |
[in] | plan | Pointer to CUDPPReducePlan object containing reduce options and intermediate storage |
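The general pattern of a dispatch routine, turning run-time options stored in a plan into compile-time template parameters, can be sketched on the host as follows. Every name here (SketchPlan, SketchOp, SketchType, reduceSketch, reduceDispatchSketch) is a hypothetical stand-in chosen for illustration; the sketch does not reproduce the fields of CUDPPReducePlan or the signature of reduceArray(), and it runs sequentially on the CPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical run-time options; not CUDPP's actual enums or plan layout.
enum class SketchOp { Add, Max };
enum class SketchType { Int, Float };
struct SketchPlan { SketchOp op; SketchType type; };

// Operators selected at compile time through a template parameter.
struct AddOp { template <class T> static T apply(T a, T b) { return a + b; } };
struct MaxOp { template <class T> static T apply(T a, T b) { return std::max(a, b); } };

// Templated worker: a sequential stand-in for the parallel reduction.
template <class T, class Op>
void reduceSketch(T* out, const T* in, std::size_t n) {
    assert(n > 0);
    T acc = in[0];
    for (std::size_t i = 1; i < n; ++i) acc = Op::apply(acc, in[i]);
    *out = acc;
}

// Second dispatch level: the element type is known, pick the operator.
template <class T>
void dispatchOp(void* out, const void* in, std::size_t n, SketchOp op) {
    T* o = static_cast<T*>(out);
    const T* i = static_cast<const T*>(in);
    switch (op) {
        case SketchOp::Add: reduceSketch<T, AddOp>(o, i, n); break;
        case SketchOp::Max: reduceSketch<T, MaxOp>(o, i, n); break;
    }
}

// First dispatch level: untyped pointers plus a plan, as in the signature above.
void reduceDispatchSketch(void* out, const void* in, std::size_t n,
                          const SketchPlan* plan) {
    switch (plan->type) {
        case SketchType::Int:   dispatchOp<int>(out, in, n, plan->op);   break;
        case SketchType::Float: dispatchOp<float>(out, in, n, plan->op); break;
    }
}
```

The untyped void pointers let a single entry point serve every element type, while the nested switch statements ensure each combination of type and operator is compiled as its own specialized instantiation.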