CUDPP
2.3
CUDA Data-Parallel Primitives Library
Scan functionality header file - contains CUDPP interface (not public)
Functions
void allocSegmentedScanStorage (CUDPPSegmentedScanPlan *plan)
    Allocate intermediate block sums, block flags and block indices arrays in a CUDPPSegmentedScanPlan class.
void freeSegmentedScanStorage (CUDPPSegmentedScanPlan *plan)
    Deallocate intermediate block sums, block flags and block indices arrays in a CUDPPSegmentedScanPlan class.
void allocSegmentedScanStorage (CUDPPSegmentedScanPlan *plan)
Allocate intermediate block sums, block flags and block indices arrays in a CUDPPSegmentedScanPlan class.
Segmented scans of large arrays must be split (possibly recursively) into a hierarchy of block segmented scans, where each block is scanned by a single CUDA thread block. At each recursive level of the scan, we need an array in which to store the total sums of all blocks in that level. Each level also needs two more arrays: one containing the OR-reductions of the flags of all blocks at that level, and one containing the min-reductions of the indices of all blocks at that level. This function computes the amount of storage needed and allocates it.
Parameters
    [in] plan    Pointer to CUDPPSegmentedScanPlan object containing segmented scan options and number of elements, which is used to compute storage requirements.
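The storage computation described above can be illustrated with a minimal host-side sketch. It is not the CUDPP implementation: the function name blockCountsPerLevel and the constant kEltsPerBlock (set here to 1024 elements per thread block) are assumptions made for illustration only. The sketch returns, for each recursive level, the number of blocks at that level, which is the length each of the three per-level arrays (block sums, block flags, block indices) would need.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical value: how many elements a single thread block scans.
// The real library derives this from its own kernel configuration.
constexpr std::size_t kEltsPerBlock = 1024;

// Hypothetical helper, not part of CUDPP.
// Returns the number of blocks at each recursive level that needs
// intermediate storage. A level that fits in a single block needs no
// block sums, flags, or indices, so it contributes no entry.
std::vector<std::size_t> blockCountsPerLevel(std::size_t numElements)
{
    std::vector<std::size_t> counts;
    std::size_t numElts = numElements;
    while (numElts > 1) {
        // Ceiling division: blocks needed to cover numElts elements.
        std::size_t numBlocks = (numElts + kEltsPerBlock - 1) / kEltsPerBlock;
        if (numBlocks > 1) {
            counts.push_back(numBlocks);
        }
        // The per-block results become the input of the next level.
        numElts = numBlocks;
    }
    return counts;
}
```

Under these assumptions, an input of 2,000,000 elements yields two levels of intermediate storage (1954 blocks, then 2 blocks), while an input of 1024 elements or fewer yields none.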
void freeSegmentedScanStorage (CUDPPSegmentedScanPlan *plan)
Deallocate intermediate block sums, block flags and block indices arrays in a CUDPPSegmentedScanPlan class.
These arrays must have been allocated by allocSegmentedScanStorage(), which is called by the constructor of CUDPPSegmentedScanPlan.
Parameters
    [in] plan    Pointer to a CUDPPSegmentedScanPlan object initialized by its constructor.
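The pairing of allocation and deallocation can be sketched as follows. This is a toy model under stated assumptions, not the library's code: ToyPlan, allocToyStorage, and freeToyStorage are hypothetical names, the member layout is invented for illustration, and host malloc/free stand in for the device allocations the real functions would perform.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a plan object; NOT the real CUDPPSegmentedScanPlan.
struct ToyPlan {
    std::size_t    numLevels    = 0;
    unsigned int **blockSums    = nullptr; // per-level totals of each block
    unsigned int **blockFlags   = nullptr; // per-level OR-reductions of flags
    unsigned int **blockIndices = nullptr; // per-level min-reductions of indices
};

// Allocates one array per level, each holding blocksPerLevel[i] entries.
// Host memory is used here purely as an assumption for the sketch.
static unsigned int **allocLevels(const std::size_t *blocksPerLevel, std::size_t n)
{
    unsigned int **levels =
        static_cast<unsigned int **>(std::malloc(n * sizeof(unsigned int *)));
    for (std::size_t i = 0; i < n; ++i) {
        levels[i] = static_cast<unsigned int *>(
            std::malloc(blocksPerLevel[i] * sizeof(unsigned int)));
    }
    return levels;
}

// Releases every per-level array, then the array of level pointers.
static void freeLevels(unsigned int **levels, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        std::free(levels[i]);
    }
    std::free(levels);
}

void allocToyStorage(ToyPlan *plan, const std::size_t *blocksPerLevel,
                     std::size_t numLevels)
{
    plan->numLevels    = numLevels;
    plan->blockSums    = allocLevels(blocksPerLevel, numLevels);
    plan->blockFlags   = allocLevels(blocksPerLevel, numLevels);
    plan->blockIndices = allocLevels(blocksPerLevel, numLevels);
}

// Must only be called on a plan previously set up by allocToyStorage().
void freeToyStorage(ToyPlan *plan)
{
    freeLevels(plan->blockSums, plan->numLevels);
    freeLevels(plan->blockFlags, plan->numLevels);
    freeLevels(plan->blockIndices, plan->numLevels);
    plan->blockSums = plan->blockFlags = plan->blockIndices = nullptr;
    plan->numLevels = 0;
}
```

Resetting the pointers and the level count after freeing is a design choice of this sketch: it leaves the toy plan in a recognizable empty state rather than holding dangling pointers.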