Tags » Cuda

Caffe with GPU support on Ubuntu 14.04

Kernel and GCC

Kernel: use 3.8.x; please do not use 3.13.x, 3.16.x, or 3.19.x
GCC: use 4.6.x; 4.8.x will have problems compiling the kernel

dpkg -i linux-headers-*.deb linux-headers-*.all.deb linux-image-*.deb

Declaring device functions in headers

Recently I had a problem when trying to split the declaration and implementation of device functions across .h and .cu files. When building the code, I would get an “Unresolved extern function” compiler error.
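The excerpt stops before the fix, so the following is only a minimal sketch of one common workaround, not necessarily what the post ended up doing: define small device functions directly in the header, so every translation unit that includes it sees the body. The names `HOST_DEVICE`, `blend`, and `blend_kernel` are hypothetical and chosen for illustration; the `__CUDACC__` guard is there so the same header also compiles as plain C++.

```cpp
// Sketch of a header (for example a hypothetical blend.h).

// nvcc defines __CUDACC__, so the CUDA qualifiers only appear when the
// file is compiled as CUDA; a plain C++ compiler sees a normal function.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper. It is defined here, not just declared, so no
// cross-file device linking is needed to call it from a kernel.
HOST_DEVICE inline float blend(float a, float b, float t) {
    return a + t * (b - a);
}

#ifdef __CUDACC__
// Hypothetical kernel in some .cu file that includes the header.
__global__ void blend_kernel(const float* a, const float* b, float t,
                             float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = blend(a[i], b[i], t);
    }
}
#endif
```

If the implementation really has to live in a separate .cu file, the other usual route is to build with relocatable device code enabled (`nvcc -rdc=true`), which lets device functions be linked across translation units.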

Code Organization

Nvidia Jetson

Finally splashed out on the TK1 dev kit; color me impressed! The buyer is first greeted by a neat-looking, minimalist cardboard box, which contains the board, a power supply, and a micro-USB cord for flashing the device.


Using Python with Cuda for parallelization

When we were tasked with creating a parallelized algorithm using CUDA to tackle a problem of our choice, we hated the idea of having to use C.


Preparing for classes

This year, again, my teaching is in the second semester. So now that the holidays are over, I’m busy making final preparations for the coming semester. I’ll be teaching a class on computer architecture, specifically focusing on current parallel systems.


Installing ArrayFire

ArrayFire is now open source!!

So I wanted to install and test it on my computer. Since they don’t yet have an installer for the open-source version, I had to build it from source.

Having fun measuring CUDA kernel execution times


When writing CUDA code, one usually has to measure the execution time of each kernel individually, to make sure that it is properly optimized. Nsight’s Performance Analysis tool has a great profiler which can do that in great detail, allowing the user to choose which kernels to profile, skip the first x invocations, and adjust a lot of other useful settings.

Code Organization