Tags » Cuda

C++11 on GPUs

I recently had to check support for C++11 constructs in compilers that target accelerators such as NVIDIA and AMD GPUs.

One option you have nowadays is to use Clang/LLVM to generate PTX/RXXX code directly from C++, but this approach introduces another set of issues, which can only be addressed by essentially implementing your own custom compiler or a set of compiler wrappers. 223 more words
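For context, these are the sorts of C++11 constructs whose device-side support has to be checked compiler by compiler. A minimal host-side sketch (plain C++, no device code; the function name is illustrative):

```cpp
#include <algorithm>
#include <vector>

// Sum via a C++11 lambda — lambdas, auto, and range-based for are
// exactly the constructs whose support varies across accelerator compilers.
int sum_with_lambda(const std::vector<int>& v) {
    int sum = 0;
    std::for_each(v.begin(), v.end(), [&sum](int x) { sum += x; });
    return sum;
}
```

Whether such a lambda is accepted inside device code depends entirely on the compiler and target in question.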


CS4532 Concurrent Programming Homework 2


This assignment covers deadlocks and deadlock-prevention algorithms. I have also attached the sample answers.

Problem Set

Homework 2



GPU Programming

Supercharging SQL Join with GTX Titan, CUDA C++, and Thrust: Part 2

Computing the matches

Here is a simple, purely brute-force algorithm for computing the join mentioned in Part 1.

Here is an entirely CPU-based implementation of the algorithm: 1,136 more words
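A brute-force equi-join of this kind can be sketched as a nested-loop scan (the `Row` layout and function name here are illustrative, not the post's actual code):

```cpp
#include <string>
#include <utility>
#include <vector>

// A hypothetical row: (key, payload).
using Row = std::pair<int, std::string>;

// Brute-force nested-loop join: compare every row of `a` against every
// row of `b` — O(n*m) independent comparisons, which is precisely the
// kind of work that maps well onto a GPU.
std::vector<std::pair<std::string, std::string>>
nested_loop_join(const std::vector<Row>& a, const std::vector<Row>& b) {
    std::vector<std::pair<std::string, std::string>> out;
    for (const Row& ra : a)
        for (const Row& rb : b)
            if (ra.first == rb.first)  // join predicate: equal keys
                out.emplace_back(ra.second, rb.second);
    return out;
}
```

The CUDA/Thrust version in Part 2 parallelises exactly this comparison space across GPU threads.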


Supercharging SQL Join with GTX Titan, CUDA C++, and Thrust: Part 1

This is a post in two parts:
Part 1 – The problem, solution setup, the algorithm.
Part 2 – (The juicy) Implementation details, discussion.

Suppose at the heart of the data layer of a web application there is a join like this: 785 more words


Compiling CUDA Projects with Dynamic Parallelism (VS 2012/13)

Just a quick note.
If you are starting from a template C++ CUDA project in VS 2012/2013, calling a kernel from a kernel (dynamic parallelism) will not compile: 60 more words
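The failing case looks like this minimal sketch (kernel names are illustrative). Dynamic parallelism requires compute capability 3.5+, relocatable device code, and linking against the device runtime, none of which the default template project sets up:

```cuda
#include <cstdio>

__global__ void child(int parent_thread) {
    printf("child launched by parent thread %d\n", parent_thread);
}

__global__ void parent() {
    // Kernel launched from a kernel — this line is what the default
    // template project refuses to compile.
    child<<<1, 1>>>(threadIdx.x);
}

int main() {
    parent<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```

On the command line the equivalent fix is `nvcc -arch=sm_35 -rdc=true file.cu -lcudadevrt`; in Visual Studio the same three settings (target architecture, relocatable device code, `cudadevrt.lib`) live in the CUDA project properties.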


Which GPU(s) to Get for Deep Learning: My Experience and Advice for Using GPUs in Deep Learning

Why should I get a GPU?

I have been using GPUs for nearly two years now, and it is amazing, again and again, to see how much speedup you get. 1,256 more words


CMake + CUDA + Dynamic parallelism (Nested parallelism)

# Dynamic parallelism needs compute capability 3.5+ and relocatable device code.
set(CUDA_SEPARABLE_COMPILATION ON)
set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS};-gencode arch=compute_35,code=sm_35)
# Link against the CUDA device runtime (adjust the toolkit path to your install).
link_directories( "C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v6.5\\lib\\x64")
target_link_libraries( DynamicParallel cudadevrt.lib )