CUDPP 1.1.1
CUDPP Documentation

Introduction

CUDPP is the CUDA Data Parallel Primitives Library. CUDPP is a library of data-parallel algorithm primitives such as parallel-prefix-sum ("scan"), parallel sort and parallel reduction. Primitives such as these are important building blocks for a wide variety of data-parallel algorithms, including sorting, stream compaction, and building data structures such as trees and summed-area tables.
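To make the terms above concrete, here is a minimal sequential sketch of what an exclusive scan computes and how stream compaction can be built on top of it. This is a CPU-only illustration in C++, not CUDPP code and not the CUDPP API: the function names exclusive_scan_ref and compact_ref are hypothetical, and the flags are assumed to be 0 or 1. A data-parallel library produces the same results, but computes them in parallel on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not part of CUDPP).
// Exclusive prefix sum: out[i] = in[0] + ... + in[i-1], and out[0] = 0.
std::vector<int> exclusive_scan_ref(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = running;   // sum of all elements before position i
        running += in[i];
    }
    return out;
}

// Hypothetical reference helper (not part of CUDPP).
// Stream compaction: keep in[i] wherever flags[i] == 1, preserving order.
// Assumes every flag is 0 or 1. The exclusive scan of the flags gives each
// kept element its position in the output.
std::vector<int> compact_ref(const std::vector<int>& in,
                             const std::vector<int>& flags) {
    assert(in.size() == flags.size());
    if (in.empty()) {
        return {};
    }
    const std::vector<int> offsets = exclusive_scan_ref(flags);
    // Total number of kept elements: last offset plus the last flag.
    const std::size_t kept =
        static_cast<std::size_t>(offsets.back() + flags.back());
    std::vector<int> out(kept);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (flags[i] == 1) {
            out[static_cast<std::size_t>(offsets[i])] = in[i];
        }
    }
    return out;
}
```

For example, scanning {3, 1, 7, 0, 4} yields {0, 3, 4, 11, 11}, and compacting {10, 20, 30, 40, 50} with flags {1, 0, 1, 1, 0} yields {10, 30, 40}. The scatter loop in compact_ref has no dependencies between iterations once the offsets are known, which is why scan is such a useful building block for parallel compaction.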

Overview Presentation

A brief set of slides that describe the features, design principles, applications and impact of CUDPP is available here: CUDPP Presentation.

Homepage

Homepage for CUDPP: http://code.google.com/p/cudpp

Announcements and discussion of CUDPP are hosted on the CUDPP Google Group.

Getting Started with CUDPP

You may want to start by browsing the CUDPP Public Interface. For information on building CUDPP, see Building CUDPP.

The "apps" subdirectory included with CUDPP has a few source code samples that use CUDPP:

We have also provided a code walkthrough of the simpleCUDPP example.

Getting Help and Reporting Problems

To get help using CUDPP, please use the CUDPP Google Group.

To report CUDPP bugs or request features, either post to the CUDPP Google Group mentioned above or file an issue directly on Google Code.

Release Notes

For specific release details see the Change Log.

This release (1.1.1) is a bugfix release to CUDPP 1.1 that includes fixes to support CUDA 3.0 and the new NVIDIA Fermi architecture, including GeForce 400 series and Tesla 20 series GPUs. It also has bug fixes for 64-bit OSes.

Operating System Support

This release (1.1.1) has been thoroughly tested on the following OSes.

We expect CUDPP to build and run correctly on other flavors of Linux and Windows, but these are not actively tested by the developers at this time.

Notes: CUDPP is not compatible with CUDA 2.1. A compiler bug in 2.1 causes the compiler to crash. Also, starting with CUDPP 1.1.1, we are no longer testing CUDA device emulation, because it is deprecated in CUDA 3.0 and will be removed from future CUDA versions.

CUDA

CUDPP is implemented in CUDA C/C++. It requires the CUDA Toolkit version 2.2 or later. Please see the NVIDIA CUDA homepage to download CUDA as well as the CUDA Programming Guide and CUDA SDK, which includes many CUDA code examples. Some of the samples in the CUDA SDK (including "marchingCubes", "lineOfSight", and "radixSort") also use CUDPP.

Design Goals

Design goals for CUDPP include:

Programmers may use any of the lower three CUDPP layers in their own programs by building the source directly into their application. However, the typical usage of CUDPP is to link to the library and invoke functions in the CUDPP Public Interface, as in the simpleCUDPP, satGL, and cudpp_testrig application examples included in the CUDPP distribution.

In the future, if and when CUDA supports building device-level libraries, we hope to enhance CUDPP to ease the use of CUDPP internal algorithms at all levels.

Use Cases

We expect the normal use of CUDPP will be in one of two ways:

  1. Linking the CUDPP library against another application.
  2. Running our "test" application, cudpp_testrig, that exercises CUDPP functionality.

References

The following publications describe work incorporated in CUDPP.

Many researchers are using CUDPP in their work, and there are many publications that have used it (references). If your work uses CUDPP, please let us know by sending us a reference (preferably in BibTeX format) to your work.

Citing CUDPP

If you make use of CUDPP primitives in your work and want to cite CUDPP (thanks!), we would prefer for you to cite the appropriate papers above, since they form the core of CUDPP. To be more specific, the GPU Gems paper describes (unsegmented) scan, multi-scan for summed-area tables, and stream compaction. The NVIDIA technical report describes the current scan and segmented scan algorithms used in the library, and the Graphics Hardware paper describes an earlier implementation of segmented scan, quicksort, and sparse matrix-vector multiply. The IPDPS paper describes the radix sort used in CUDPP, and the I3D paper describes the random number generation algorithm.

Credits

CUDPP Developers

Other CUDPP Contributors

Acknowledgments

Thanks to Jim Ahrens, Timo Aila, Nathan Bell, Ian Buck, Guy Blelloch, Jeff Bolz, Michael Garland, Jeff Inman, Eric Lengyel, Samuli Laine, David Luebke, Pat McCormick, and Richard Vuduc for their contributions during the development of this library.

CUDPP Developers from UC Davis thank their funding agencies:

CUDPP Copyright and Software License

CUDPP is copyright The Regents of the University of California, Davis campus and NVIDIA Corporation. The library, examples, and all source code are released under the BSD license, designed to encourage reuse of this software in other projects, both commercial and non-commercial. For details, please see the CUDPP License page.

Note that prior to release 1.1, CUDPP used a modified BSD license. With release 1.1, this license was replaced with the pure BSD license to facilitate hosting the code on open source hosting sites.
