CUDPP  2.3
CUDA Data-Parallel Primitives Library
Todo List
Member allocReduceStorage (CUDPPReducePlan *plan)
Should this flag an error?
Class CudaHT::CuckooHashing::HashTable
Templatize the interface without forcing the header file to have CUDA calls.
Member CudaHT::CuckooHashing::MultivalueHashTable::Retrieve (const unsigned, const unsigned *, unsigned *)
Remove this function entirely somehow.
Member CUDPP_OPTION_CTA_LOCAL
Currently ignored.
Member cudppCompact (const CUDPPHandle planHandle, void *d_out, size_t *d_numValidElements, const void *d_in, const unsigned int *d_isValid, size_t numElements)
[MJH] We need to evaluate whether cudppCompact should be a core member of the public interface. It's not clear to me that what the user always wants is a final compacted array. Often one just wants the array of indices to which each input element should go in the output. The split() routine used in radix sort might make more sense to expose.
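The distinction drawn in this note, between returning a compacted array and returning only the destination index of each input element, can be illustrated with a minimal host-side sketch. It is not CUDPP code: `compactIndices` and `scatterValid` are hypothetical names introduced here, and the sketch assumes that the destination indices are the exclusive prefix sum of the validity flags, which is the usual way compaction is built on top of scan.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUDPP): computes, for every input
// element, the index it would occupy in the compacted output. This is an
// exclusive prefix sum over the validity flags. Entries belonging to
// invalid elements are still written but should be ignored by the caller.
std::vector<std::size_t> compactIndices(const std::vector<unsigned int>& isValid,
                                        std::size_t& numValid)
{
    std::vector<std::size_t> dest(isValid.size());
    std::size_t running = 0;
    for (std::size_t i = 0; i < isValid.size(); ++i) {
        dest[i] = running;          // exclusive: count of valid elements before i
        if (isValid[i] != 0) {
            ++running;
        }
    }
    numValid = running;             // total number of valid elements
    return dest;
}

// Hypothetical helper (not part of CUDPP): the optional second step that
// turns the index array into a final compacted array by scattering.
template <typename T>
std::vector<T> scatterValid(const std::vector<T>& in,
                            const std::vector<unsigned int>& isValid,
                            const std::vector<std::size_t>& dest,
                            std::size_t numValid)
{
    std::vector<T> out(numValid);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (isValid[i] != 0) {
            out[dest[i]] = in[i];
        }
    }
    return out;
}
```

A caller that only needs the index array can stop after `compactIndices`; the scatter step is needed only when a final compacted array is actually wanted, which is the case the note questions.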
Member CUDPPCompactPlan::CUDPPCompactPlan (CUDPPManager *mgr, CUDPPConfiguration config, size_t numElements, size_t numRows, size_t rowPitch)
Add support for multirow compaction
Member cudppMergeSort (const CUDPPHandle planHandle, void *d_keys, void *d_values, size_t numElements)
Determine if we need to provide an "out of place" sort interface.
Member cudppRadixSort (const CUDPPHandle planHandle, void *d_keys, void *d_values, size_t numElements)
Determine if we need to provide an "out of place" sort interface.
Member cudppRand (const CUDPPHandle planHandle, void *d_out, size_t numElements)
Currently only the MD5 PRNG is supported. More rand routines may be provided in the future.
Member cudppStringSortAligned (const CUDPPHandle planHandle, unsigned int *d_keys, unsigned int *d_values, unsigned int *stringVals, size_t numElements, size_t stringArrayLength)
Determine if we need to provide an "out of place" sort interface.
Member DISALLOW_LOADSTORE_OVERLAP
Parameterize this in case this perf detail changes on future GPUs.
File hash_multivalue.h
Figure out why there are still issues when running under Windows.
Member launchRandMD5Kernel (unsigned int *d_out, unsigned int seed, size_t numElements)
Choose a better block size; perhaps a multiple of two is optimal.
Member rank4 (uint4 preds)
Is the description of "preds" correct?
Member reorderData (uint *outKeys, uint *outValues, uint2 *keys, uint2 *values, uint *blockOffsets, uint *offsets, uint *sizes, uint numElements, uint totalBlocks)
Arguments documented as const below should be declared const in the prototype.
Member vectorAddConstant (T *d_vector, T constant, int n, int baseIndex)
Test this function; it is not yet used.
Member vectorAddVector (T *d_vectorA, const T *d_vectorB, int numElements, int baseIndex)
Test this function; it is not yet used.