CUDPP 2.3
CUDA Data-Parallel Primitives Library
Publications that use CUDPP

Daniele D'Agostino and Frank J Seinstra. A parallel isosurface extraction component for visualization pipelines executing on GPU clusters. Journal of Computational and Applied Mathematics, 273:383–393, January 2015. [ bib | DOI ]

Ezio Bartocci, Richard DeFrancisco, and Scott A. Smolka. Towards a GPGPU-Parallel SPIN model checker. In Proceedings of the 2014 International SPIN Symposium on Model Checking of Software, pages 87–96. ACM, July 2014. [ bib ]

Apostolos Glenis and Sergios Petridis. Performance and energy characterization of high-performance low-cost cornerness detection on GPUs and multicores. In International Conference on Information, Intelligence, Systems and Applications, IISA 2014, pages 181–186. IEEE, July 2014. [ bib | DOI ]

Win-Tsung Lo, Yue-Shan Chang, Ruey-Kai Sheu, Chun-Chieh Chiu, and Shyan-Ming Yuan. CUDT: A CUDA Based Decision Tree Algorithm. The Scientific World Journal, July 2014. [ bib | DOI ]

Tsz Ho Wong, Geoff Leach, and Fabio Zambetta. An adaptive octree grid for GPU-based collision detection of deformable objects. The Visual Computer, 30(6–8):729–738, May 2014. [ bib | DOI ]

Anna Fabijańska and Jaroslaw Goclawski. New accelerated graph-based method of image segmentation applying minimum spanning tree. IET Image Processing, 8(4), April 2014. [ bib | DOI ]

Junjie Chen, Xiaogang Jin, and Zhigang Deng. GPU-based polygonization and optimization for implicit surfaces. The Visual Computer, pages 1–12, March 2014. [ bib | DOI ]

Tomislav Matić, Ivan Aleksi, and Željko Hocenski. CPU, GPU and FPGA Implementations of MALD: Ceramic Tile Surface Defects Detection Algorithm. AUTOMATIKA: časopis za automatiku, mjerenje, elektroniku, računarstvo i komunikacije, 55(1):9–21, January 2014. [ bib ]

Atle Riise and Edmund K Burke. On parallel local search for permutations. Journal of the Operational Research Society, 2014. [ bib | DOI ]

Fadi N Sibai and Ali El-Moursy. Performance evaluation and comparison of parallel conjugate gradient on modern multi-core accelerator and massively parallel systems. International Journal of Parallel, Emergent and Distributed Systems, 29(1):38–67, January 2014. [ bib | DOI ]

Saima Parveen and Jaya Sreevalsan-Nair. Visualization of Small World Networks Using Similarity Matrices. In Big Data Analytics, pages 151–170. Springer, December 2013. [ bib | DOI ]

Roberto Pinto Souto, Carla Osthoff, Douglas Augusto, Oswaldo Trelles, et al. Performance Evaluation of Quicksort with GPU Dynamic Parallelism for Gene-Expression Quantile Normalization. Journal of Communication and Computer, 10(12):1522–1528, December 2013. [ bib ]

Qing Dai and Xubo Yang. Interactive Smoke Simulation and Rendering on the GPU. In Proceedings of the 12th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and Its Applications in Industry, VRCAI '13, pages 177–182, New York, NY, USA, November 2013. ACM. [ bib | DOI ]

Alejandro Hidalgo-Paniagua, Miguel A. Vega-Rodríguez, Nieves Pavón, and Joaquín Ferruz. A comparative study of parallel software SURF implementations. Concurrency and Computation: Practice and Experience, October 2013. [ bib | DOI ]

Byungjoon Chang, Woong Seo, and Insung Ihm. On the Efficient Implementation of a Real-time Kd-tree Construction Algorithm. In Symposium on GPU Computing and Applications, October 2013. [ bib | .pdf ]

Senhong Wang, Yan Zhao, Qiong Luo, Chao Wu, and Yang Xv. Accelerating In-memory Cross Match of Astronomical Catalogs. In IEEE 9th International Conference on eScience, pages 326–333. IEEE, October 2013. [ bib | DOI ]

Baoxue Zhao, Qiong Luo, and Chao Wu. Parallelizing Astronomical Source Extraction on the GPU. In IEEE 9th International Conference on eScience, pages 88–97. IEEE, October 2013. [ bib | DOI ]

Sérgio Dias and Abel Gomes. Triangulating molecular surfaces on multiple GPUs. In Proceedings of the 20th European MPI Users' Group Meeting, pages 181–186. ACM, September 2013. [ bib | DOI ]

Mingcen Gao, Thanh-Tung Cao, Ashwin Nanjappa, Tiow-Seng Tan, and Zhiyong Huang. gHull: A GPU algorithm for 3D convex hull. ACM Transactions on Mathematical Software (TOMS), 40(1):3, September 2013. [ bib | DOI ]

Yanwei Zhao, Qiang Qiu, Jinyun Fang, and Liang Li. Fast parallel interpolation algorithm using CUDA. In IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2013, pages 3662–3665. IEEE, July 2013. [ bib | DOI ]

Mengjuan Li, Lianyin Jia, Jinguo You, Jianqing Xi, HaiFei Qin, and Rui Zeng. Fast T-overlap query algorithms using graphics processor units and its applications in web data query. World Wide Web, pages 1–17, June 2013. [ bib | DOI ]

XiaoQiang Zhu, XueKun Guo, and XiaoGang Jin. Efficient polygonization of tree trunks modeled by convolution surfaces. Science China Information Sciences, 56(3):1–12, March 2013. [ bib | DOI ]

Ashwini A Patil and Pankaja A Shahapure. A GPU-Accelerated Framework for Image Processing and Computer Vision. International Journal of Latest Trends in Engineering and Technology, pages 115–120, 2013. [ bib ]

M. L. Sætra. Shallow Water Simulation on GPUs for Sparse Domains. In Andrea Cangiani, Ruslan L. Davidchack, Emmanuil Georgoulis, Alexander N. Gorban, Jeremy Levesley, and Michael V. Tretyakov, editors, Numerical Mathematics and Advanced Applications 2011, pages 673–680. Springer Berlin Heidelberg, 2013. [ bib | DOI ]

Xin Yang, Duan-qing Xu, and Lei Zhao. Efficient data management for incoherent ray tracing. Applied Soft Computing, 13(1):1–8, January 2013. [ bib | DOI ]

Andrew Davidson, David Tarjan, Michael Garland, and John D. Owens. Efficient Parallel Merge Sort for Fixed and Variable Length Keys. In Proceedings of Innovative Parallel Computing (InPar '12), May 2012. [ bib | DOI | http ]

Ritesh A. Patel, Yao Zhang, Jason Mak, and John D. Owens. Parallel Lossless Data Compression on the GPU. In Proceedings of Innovative Parallel Computing (InPar '12), May 2012. [ bib | DOI | http ]

Ayal Stein, Eran Geva, and Jihad El-Sana. CudaHull: Fast parallel 3D convex hull on the GPU. Computers & Graphics, 36(4):265–271, March 2012. Applications of Geometry Processing. [ bib | DOI ]

Yue-Shan Chang, Ruey-Kai Sheu, Shyan-Ming Yuan, and Jyn-Jie Hsu. Scaling database performance on GPUs. Information Systems Frontiers, 14(4):909–924, 2012. [ bib | DOI ]

Jaroslaw Goclawski and Joanna Sekulska-Nalewajko. A Graph-Based Approach to the Segmentation of Images with Mould Filled Foam Matrices. Image Processing & Communications, 17(4):59–70, 2012. [ bib ]

Pawan Harish, P. J. Narayanan, Vibhav Vineet, and Suryakant Patidar. Fast Minimum Spanning Tree Computation. In Wen-mei W. Hwu, editor, GPU Computing Gems Jade Edition, pages 77–88. Morgan Kaufmann, Boston, 2012. [ bib | DOI ]

Tyson J. Lipscomb, Anqi Zou, and Samuel S. Cho. Parallel Verlet Neighbor List Algorithm for GPU-Optimized MD Simulations. In Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, BCB '12, pages 321–328, New York, NY, USA, 2012. ACM. [ bib | DOI ]

Guilan Wang and Guoliang Zhou. GPU-Based Aggregation of On-Line Analytical Processing. In Maotai Zhao and Junpin Sha, editors, Communications and Information Processing, volume 288 of Communications in Computer and Information Science, pages 234–245. Springer Berlin Heidelberg, 2012. [ bib | DOI ]

Ming Zeng, Fukai Zhao, Jiaxiang Zheng, and Xinguo Liu. A Memory-Efficient KinectFusion Using Octree. In Shi-Min Hu and Ralph R. Martin, editors, Computational Visual Media, volume 7633 of Lecture Notes in Computer Science, pages 234–241. Springer Berlin Heidelberg, 2012. [ bib | DOI ]

Kun Zhou. GPU parallel computing: Programming language, debugging tools and data structures. Frontiers of Electrical and Electronic Engineering, 7(1):5–15, 2012. [ bib | DOI ]

Iason Oikonomidis, Nikolaos Kyriazis, and Antonis Argyros. Efficient Model-based 3D Tracking of Hand Articulations using Kinect. In Proceedings of the British Machine Vision Conference, pages 101.1–101.11. BMVA Press, September 2011. [ bib | DOI ]

Andrew Thall. Fast Mersenne Prime Testing on the GPU. In Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units, GPGPU-4, pages 6:1–6:8, New York, NY, USA, March 2011. ACM. [ bib | DOI ]

Wu-chun Feng, Yong Cao, Debprakash Patnaik, and Naren Ramakrishnan. Temporal Data Mining for Neuroscience. In Wen-mei W. Hwu, editor, GPU Computing Gems, volume 1, chapter 15, pages 211–227. Morgan Kaufmann, February 2011. [ bib | DOI ]

Chun-Chieh Chiu, Guo-Heng Luo, and Shyan-Ming Yuan. A decision tree using CUDA GPUs. In Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services, iiWAS '11, pages 399–402, New York, NY, USA, 2011. ACM. [ bib | DOI ]

Eli Koffi Kouassi, Toshiyuki Amagasa, and Hiroyuki Kitagawa. Efficient Probabilistic Latent Semantic Indexing using Graphics Processing Unit. Procedia Computer Science, 4(0):382–391, 2011. Proceedings of the International Conference on Computational Science. [ bib | DOI ]

Chia-Feng Lin and Shyan-Ming Yuan. The Design and Evaluation of GPU Based Memory Database. In 2011 Fifth International Conference on Genetic and Evolutionary Computing (ICGEC), pages 224–231, 2011. [ bib | DOI ]

Weidong Sun, Weiwei Wang, and Zongmin Ma. Fast Short Exact Repeats Finding on GPU. In 2010 3rd International Conference on Biomedical Engineering and Informatics (BMEI), volume 5, pages 2197–2200, October 2010. [ bib | DOI ]

Dragan Bošnački, Stefan Edelkamp, Damien Sulewski, and Anton Wijs. GPU-PRISM: An extension of PRISM for General Purpose Graphics Processing Units. In 2010 Ninth International Workshop on Parallel and Distributed Methods in Verification, and Second International Workshop on High Performance Computational Systems Biology, pages 17–19, September 2010. [ bib | DOI ]

Anjul Patney, Stanley Tzeng, and John D. Owens. Fragment-Parallel Composite and Filter. Computer Graphics Forum (Proceedings of the Eurographics Symposium on Rendering), 29(4):1251–1258, June 2010. [ bib | DOI | http ]

Kirill Garanzha and Charles Loop. Fast Ray Sorting and Breadth-First Packet Traversal for GPU Ray Tracing. Computer Graphics Forum, 29:289–298, May 2010. [ bib | DOI ]

W. Bailer, H. Fassold, F. Lee, and J. Rosner. Tracking and Clustering Salient Features in Image Sequences. In 2010 Conference on Visual Media Production (CVMP), pages 17–24, 2010. [ bib | DOI ]

R. Cabido, A. Duarte, A. S. Montemayor, and J. J. Pantrigo. Differential Evolution for Global Optimization on GPU. In International Conference on Metaheuristics and Nature Inspired Computing, 2010. [ bib ]

Yupeng Guo, Xiaoguang Liu, Gang Wang, Fan Zhang, and Xin Zhao. An Improved Parallel MEMS Processing-Level Simulation Implementation Using Graphic Processing Unit. In Ching-Hsien Hsu, Laurence T. Yang, JongHyuk Park, and Sang-Soo Yeo, editors, Algorithms and Architectures for Parallel Processing, volume 6082 of Lecture Notes in Computer Science, pages 289–296. Springer Berlin Heidelberg, 2010. [ bib | DOI ]

D. Patnaik, S. P. Ponce, Yong Cao, and N. Ramakrishnan. Accelerator-Oriented Algorithm Transformation for Temporal Data Mining. In Sixth IFIP International Conference on Network and Parallel Computing (NPC '09), pages 93–100, October 2009. [ bib | DOI ]

Deyuan Qiu, Stefan May, and Andreas Nüchter. GPU-accelerated Nearest Neighbor Search for 3D Registration. In ICVS 2009: Proceedings of the 7th International Conference on Computer Vision Systems, October 2009. [ bib ]

Apeksha Godiyal, Jared Hoberock, Michael Garland, and John C. Hart. Rapid Multipole Graph Drawing on the GPU. In Proceedings of the 16th International Symposium on Graph Drawing, volume 5417 of Lecture Notes in Computer Science, pages 90–101. Springer, September 2009. [ bib | DOI ]

Hagen Peters, Ole Schulz-Hildebrandt, and Norbert Luttenberger. Fast comparison-based in-place sorting with CUDA. In Eighth International Conference on Parallel Processing and Applied Mathematics, September 2009. [ bib ]

Markus Billeter, Ola Olsson, and Ulf Assarsson. Efficient Stream Compaction on Wide SIMD Many-Core Architectures. In Proceedings of High Performance Graphics 2009, pages 159–166, August 2009. [ bib | DOI ]

Jared Hoberock, Victor Lu, Yuntao Jia, and John C. Hart. Stream Compaction for Deferred Shading. In Proceedings of High Performance Graphics 2009, pages 173–180, August 2009. [ bib | DOI ]

Anjul Patney, Mohamed S. Ebeida, and John D. Owens. Parallel View-Dependent Tessellation of Catmull-Clark Subdivision Surfaces. In Proceedings of High Performance Graphics 2009, pages 99–108, August 2009. [ bib | DOI | http ]

Vibhav Vineet, Pawan Harish, Suryakant Patidar, and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. In Proceedings of High Performance Graphics 2009, pages 167–171, August 2009. [ bib | DOI ]

Sean P. Ponce. Towards Algorithm Transformation for Temporal Data Mining on GPU. Master's thesis, Department of Computer Science, Virginia Polytechnic Institute and State University, 7 July 2009. [ bib ]

Nadathur Satish, Mark Harris, and Michael Garland. Designing Efficient Sorting Algorithms for Manycore GPUs. In Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium, May 2009. [ bib ]

Sean Peter Dukehart. GPU Random Walkers for Iterative Image Segmentation. Master's thesis, Department of Computer Science, University of Maryland Baltimore County, February 2009. [ bib ]

Christian Eisenacher, Quirin Meyer, and Charles Loop. Real-Time View-Dependent Rendering of Parametric Surfaces. In I3D '09: Proceedings of the 2009 Symposium on Interactive 3D Graphics and Games, pages 137–143, February/March 2009. [ bib | DOI ]

Linh Ha, Jens Krüger, and Cláudio T. Silva. Fast Four-Way Parallel Radix Sorting on GPUs. Computer Graphics Forum, 28(8):2368–2378, 2009. [ bib | DOI | http ]

B. Huang, Jinlan Gao, and Xiaoming Li. An Empirically Optimized Radix Sort for GPU. In 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, pages 234–241, 2009. [ bib | DOI ]

Yannick Allusse, Patrick Horain, Ankit Agarwal, and Cindula Saipriyadarshan. GpuCV: A GPU-Accelerated Framework for Image Processing and Computer Vision. In Advances in Visual Computing, volume 5359 of Lecture Notes in Computer Science, pages 430–439. Springer, December 2008. [ bib | DOI ]

Anjul Patney and John D. Owens. Real-Time Reyes-Style Adaptive Surface Subdivision. ACM Transactions on Graphics, 27(5):143:1–143:8, December 2008. [ bib | DOI | http ]

Shubhabrata Sengupta, Mark Harris, and Michael Garland. Efficient Parallel Scan Algorithms for GPUs. Technical Report NVR-2008-003, NVIDIA Corporation, December 2008. [ bib | http ]

Kun Zhou, Qiming Hou, Rui Wang, and Baining Guo. Real-time KD-tree Construction on Graphics Hardware. ACM Transactions on Graphics, 27(5):126:1–126:11, December 2008. [ bib ]

George Stantchev, William Dorland, and Nail Gumerov. Fast parallel Particle-To-Grid interpolation for plasma PIC simulations on the GPU. Journal of Parallel and Distributed Computing, 68(10):1339–1349, October 2008. [ bib | DOI | http ]

Qiming Hou, Kun Zhou, and Baining Guo. BSGP: Bulk-Synchronous GPU Programming. ACM Transactions on Graphics, 27(3):19:1–19:13, August 2008. [ bib ]

Jike Chong, Youngmin Yi, Arlo Faria, Nadathur Satish, and Kurt Keutzer. Data-Parallel Large Vocabulary Continuous Speech Recognition on Graphics Processors. In Proceedings of the 1st Annual Workshop on Emerging Applications and Many Core Architecture (EAMA), pages 23–35, June 2008. [ bib | .html ]

Alexander Ladikos, Selim Benhimane, and Nassir Navab. Efficient Visual Hull Computation for Real-Time 3D Reconstruction using CUDA. In CVPRW '08: Computer Vision and Pattern Recognition Workshops, pages 1–8, June 2008. [ bib | DOI ]

Dominique Aubert, Mehdi Amini, and Romaric David. A Particle-Mesh Integrator for Galactic Dynamics Powered by GPGPUs. In Proceedings of the 9th International Conference on Computational Science, volume 5544 of Lecture Notes in Computer Science, pages 874–883. Springer, May 2008. [ bib | DOI ]

Kun Zhou, Minmin Gong, Xin Huang, and Baining Guo. Highly Parallel Surface Reconstruction. Technical Report MSR-TR-2008-53, Microsoft Research, 1 April 2008. [ bib ]

Shubhabrata Sengupta, Mark Harris, Yao Zhang, and John D. Owens. Scan Primitives for GPU Computing. In Graphics Hardware 2007, pages 97–106, August 2007. [ bib | http ]