An Introduction to GPU computing (CUDA) on Porteus

This is meant as an introduction to doing some serious number crunching on the GPU using Porteus and the CUDA programming framework. We will start from the very basics, i.e. how to set everything up, and go all the way to compiling some simple examples. This post will be updated as needed. However, please keep in mind that for any questions related to CUDA programming itself you should use the NVIDIA forum instead.

1) Requirements
2) Download the necessary files
3) Installation of the framework and the SDK
4) Build the shared and common libs
5) Do some tests
6) Have some fun!

All the tests during the writing of this post were performed on an HP dv8-1190eo laptop with 8GB RAM and a GeForce GT 230M graphics card, running Porteus x64.

1) Requirements

i) A laptop/PC running Porteus
ii) Some familiarity with C/C++ and programming in general (can't help you with this!)
iii) A supported NVIDIA graphics card. The i486 version of the driver can be found here, and the x64 version of the driver can be found here.

If you are not sure whether your card is supported, a quick way to check is to see if your laptop or PC has an NVIDIA sticker that says something like "GEFORCE with CUDA", or just check on NVIDIA's site whether your card is supported.
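If you cannot find the sticker, you can at least confirm from a console that the machine has an NVIDIA display adapter. The snippet below is only a rough sketch: the function name nvidia_gpus is made up for this post, and it assumes the lspci tool (from pciutils) is available. Note that it only tells you the card is from NVIDIA, not that it supports CUDA.

```shell
# Print the lines of lspci-style text (read from stdin) that describe an
# NVIDIA display adapter. Reading stdin means it also works on saved output.
nvidia_gpus() {
    grep -i 'nvidia' | grep -i -E 'vga|3d|display'
}

# Typical use on the machine itself (not run here):
#   lspci | nvidia_gpus
```

If nothing is printed, there is no NVIDIA card visible on the PCI bus and the rest of this guide will not apply.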

2) Download the necessary files (take care to download the x64 versions where available)
You will need the following files freely available from NVIDIA's site:

The NVIDIA driver found at the Porteus site includes the CUDA driver, so you only need to download the Toolkit and the SDK.

i) Grab the CUDA toolkit (it's the Fedora version, but it works OK):

ii) Grab the SDK:
The SDK contains tons of examples and it is vital if you want to do any serious work.

iii) Grab any extra Guides or Libraries you want. The GPU-accelerated LAPACK libraries could be useful, but I will cover that in a future post.

3) Installation of the framework and the SDK

i) To install the files, open a console in the folder where you downloaded them and type:

  sh ./

The command above will install the toolkit. When prompted, choose the recommended path. You can of course choose a different one, but that complicates things.

ii) Then, in the same console, type:

 sh ./

The command above will install the SDK. Again when prompted choose the recommended path. You might get a couple of warnings during the installation (I did) but everything should be OK.

iii) Go to your .bash_profile file (it is hidden in your Home folder) and add the following lines:



export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64

export PATH


And yes, there is some redundancy, but leaving out some of the lines might cause errors during the tests. Anyway, remember to change lib64 to just lib if you have the 32-bit version.
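If the redundancy bothers you, one possible way around it is to append a directory only when it is not already in the list. This is just a sketch and not what the post originally used; the helper names path_contains and path_append are made up here, and the only directory taken from the post is /usr/local/cuda/lib64.

```shell
# Succeed if the colon-separated list $1 already contains directory $2.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print list $1 with directory $2 appended, unless it is already there.
path_append() {
    if path_contains "$1" "$2"; then
        printf '%s\n' "$1"
    elif [ -z "$1" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1:$2"
    fi
}

# Possible use in .bash_profile (not run here):
#   LD_LIBRARY_PATH=$(path_append "$LD_LIBRARY_PATH" /usr/local/cuda/lib64)
#   export LD_LIBRARY_PATH
```

Sourcing the profile twice then leaves the variable unchanged instead of growing it each time.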

iv) REBOOT!!!

4) Build the shared and common libs
i) After the reboot, go to:

Type make to build the shared libs. These are not necessary to use CUDA, but they are required for the examples. The result should be a lib: libshrutil_x86_64.a in the lib folder.

ii) Go to:


Type make to build the common libs. Again, this is not necessary for CUDA, just the examples. The result should be a lib: libcutil_x86_64.a in the lib folder.

5) Do some tests
The folder /root/NVIDIA_GPU_Computing_SDK/C/src/ contains the examples.
So, go to the folder named bandwidthTest/ and in a console type


This will build the example and, if no errors have occurred, a binary will have been created in /root/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/, so go to this folder and run the binary bandwidthTest. The result should be something like:

./bandwidthTest Starting...
 Running on...
 Device 0: GeForce GT 230M
 Quick Mode
 Host to Device Bandwidth, 1 Device(s), Paged memory
 Transfer Size (Bytes)    Bandwidth(MB/s)
 33554432            3354.7
 Device to Host Bandwidth, 1 Device(s), Paged memory
 Transfer Size (Bytes)    Bandwidth(MB/s)
 33554432            3855.3
 Device to Device Bandwidth, 1 Device(s)
 Transfer Size (Bytes)    Bandwidth(MB/s)
 33554432            20571.4
 [bandwidthTest] - Test results:
 PASSED

The critical part is where it says PASSED, which means that everything is working OK.
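If you want to check several examples in a row without reading all the output, something like the following could work. It is a minimal sketch; the function name test_passed is made up for this post, and it simply assumes that a successful SDK example prints the word PASSED somewhere in its output, as bandwidthTest does.

```shell
# Succeed if the text on stdin contains the standalone word PASSED.
test_passed() {
    grep -q -E '(^|[[:space:]])PASSED([[:space:]]|$)'
}

# Possible use from the release folder (not run here):
#   ./bandwidthTest | test_passed && echo "bandwidthTest OK"
```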

6) Have some fun!

Try compiling some of the other examples, but keep in mind that some of them have extra requirements, like the OpenGL lib etc.

Report any problems or questions in this thread:

Compilation and usage of custom Porteus kernel

Before you start: make a backup of your existing Porteus installation!

WARNING: after upgrading, all kernel dependent Porteus modules like VirtualBox or proprietary GPU drivers will need to be recompiled against the new kernel version too.
WARNING: If you are changing something in the kernel config and recompiling the kernel once again, you may need to replace all of the kernel modules (M) in the ramdisk and 000-kernel.xzm accordingly.

Hardware requirements:
More than 2 GB of memory if you are running Porteus with the copy2ram cheatcode and compiling with the sources in your live filesystem, and
1.0 GB of free space on your USB stick or hard drive if you are building a maximum compatibility kernel (i.e., all options enabled), so that you can save your kernel sources. Note that if you are compiling from within Porteus, it is best to place the source code in your live filesystem (this article uses the /root/ directory). However, if you have 2 GB of RAM or less and you are compiling a maximum compatibility kernel, it is best to keep the source code on a hard drive or USB stick (formatted with a Linux filesystem) while it is compiling. Otherwise, you may run out of room in aufs and your compile will fail.
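To decide where to put the sources, you can read the amount of installed RAM from /proc/meminfo. The snippet below is a small sketch of that check; the function names are made up for this article, and the 2048 MB threshold is just the "more than 2 GB" rule of thumb above turned into a number.

```shell
# Print total RAM in MB, parsed from /proc/meminfo-style text on stdin.
mem_total_mb() {
    awk '/^MemTotal:/ { print int($2 / 1024) }'
}

# Succeed if the given amount of RAM (in MB) is more than 2 GB.
enough_ram_for_ram_build() {
    [ "$1" -gt 2048 ]
}

# Possible use (not run here):
#   mb=$(mem_total_mb < /proc/meminfo)
#   enough_ram_for_ram_build "$mb" || echo "keep the sources on disk instead"
```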

Let's start:

Section I Kernel

First of all, make sure that the Porteus development package is activated:

activate /mnt/sdb1/porteus/optional/005-devel.xzm

Download your new kernel version and unpack it somewhere:

tar -xf linux-3.7.8.tar.xz -C /root/

Get the aufs patch for the 3.7.x kernel with the script below (you must have the 'git' utility installed on your system):


mkdir /tmp/aufs$$
cd /tmp/aufs$$
git clone git:// aufs3-standalone.git
cd aufs3-standalone.git
# branch for the stable 3.7.x kernel (change it to match your kernel version)
git checkout origin/aufs3.7
# uncomment line below to get aufs for latest -rc kernel
#git checkout origin/aufs3.x-rcN
mkdir ../a ../b
cp -r {Documentation,fs,include} ../b
rm ../b/include/uapi/linux/Kbuild 2>/dev/null || rm ../b/include/linux/Kbuild
cd ..
diff -rupN a/ b/ > /root/linux-3.7.8/aufs.patch

# extra patches:
cat aufs3-standalone.git/*.patch >> /root/linux-3.7.8/aufs.patch

# cleanup:
rm -r /tmp/aufs$$

(If you are using a different kernel version, please change the 'git checkout origin/aufs3.7' command accordingly, for example to 'git checkout origin/aufs3.9'.)
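The branch name can also be derived from the kernel version instead of being typed by hand. This is only a sketch: the function name aufs_branch is made up, and it assumes the branches of stable kernels follow the aufs<major>.<minor> pattern seen in the two examples above (it does not handle the -rc branches).

```shell
# Print the aufs branch name for a kernel version, e.g. 3.7.8 -> aufs3.7.
aufs_branch() {
    # Keep only the "major.minor" part of the version string.
    printf 'aufs%s\n' "$(printf '%s\n' "$1" | cut -d. -f1,2)"
}

# Possible use inside the script above (not run here):
#   git checkout "origin/$(aufs_branch 3.7.8)"
```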

Patch the kernel:

cd /root/linux-3.7.8
patch -p1 < aufs.patch

Once you have the kernel patched, you need to configure it. The best way is to use the old Porteus kernel config file. The configs for Porteus are compiled as a kernel module, so you have to modprobe it and unzip it to the directory where your new kernel sources are before you can apply them:

modprobe configs && zcat /proc/config.gz > .config
make oldconfig

If you are not sure which option to choose, it is best to keep the enter key pressed (default options are usually safe).

Check if your configuration is correct:

make menuconfig

Navigate to "File systems" menu and make sure that FUSE will be compiled in (*). Then go to -> "Miscellaneous filesystems". Aufs and Squashfs must also be compiled in (*), as well as xz compression for Squashfs (*)

Make sure that the kernel supports an initrd compressed with xz.
Mark other drivers and features as you like :D
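After leaving menuconfig you can double check the result by looking at the .config file directly. The sketch below lists every required option that is not built in. The function name is made up for this article, and the mapping of the menu entries above to the symbols CONFIG_FUSE_FS, CONFIG_AUFS_FS, CONFIG_SQUASHFS, CONFIG_SQUASHFS_XZ and CONFIG_RD_XZ is my assumption, so verify it against your own config.

```shell
# Print each required option that is not set to =y in the given config file.
missing_builtin_options() {
    config_file=$1
    for opt in CONFIG_FUSE_FS CONFIG_AUFS_FS CONFIG_SQUASHFS \
               CONFIG_SQUASHFS_XZ CONFIG_RD_XZ; do
        grep -q "^${opt}=y\$" "$config_file" || printf '%s\n' "$opt"
    done
}

# Possible use from the kernel source directory (not run here):
#   missing_builtin_options .config
```

No output means all five options are compiled in. An option set to =m is reported too, since a module is not enough here.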

Now it's time to build a kernel, so:

make && make modules_install && make firmware_install

It's going to take a long time, so you'd better grab a beer :)

(Note: if you have a dual-core processor, you can run the first command as 'make -j3' to speed things up a bit.)
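The same rule of thumb (number of cores plus one) can be computed instead of hard-coded. This is a tiny sketch; the function name make_jobs is made up, and the usage line assumes the nproc tool from coreutils is present on your system.

```shell
# Print a job count of "cores + 1", so 2 cores give 3 as in the note above.
make_jobs() {
    echo $(( $1 + 1 ))
}

# Possible use (not run here):
#   make -j"$(make_jobs "$(nproc)")" && make modules_install && make firmware_install
```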

If no errors are reported you can copy your shiny new kernel to the Porteus /boot directory:

cp arch/x86/boot/bzImage /mnt/sdb1/boot/syslinux/vmlinuz


  1. It's a good idea to store your compiled sources somewhere (around 1GB when uncompressed), in case you need to add or change something. If you have a backup you won't have to go through the whole process once again, and compilation will be much faster.
  2. If you use Porteus on different machines, try to compile as many drivers as possible as modules (M), so the kernel won't be so bloated (my Gentoo kernel is only 2MB when stripped to the maximum).
  3. The best place for compilation is RAM (fastest). Boot porteus without changes= cheatcode or use tmpfs instead:


Section II Updating 000-kernel.xzm module with new drivers


Now we need to get rid of the old drivers from 000-kernel.xzm, so:


cp -r /mnt/live/memory/images/000-kernel.xzm/ /root/000-kernel
rm -r /root/000-kernel/lib/modules/*
rm -r /root/000-kernel/lib/firmware/*
cp -r /mnt/live/memory/changes/lib/firmware /root/000-kernel/lib
cp -r /mnt/live/memory/changes/lib/modules/your-new-kernel-version /root/000-kernel/lib/modules
rm /mnt/sdb1/porteus/base/000-kernel.xzm
dir2xzm /root/000-kernel/ /mnt/sdb1/porteus/base/000-kernel.xzm
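Since the script above deletes the old 000-kernel.xzm, it may be worth checking first that the new modules really ended up in the copied tree. The sketch below looks for modules.dep, which 'make modules_install' normally generates through depmod. The function name modules_present is made up, and the version string in the usage line is only a placeholder.

```shell
# Succeed if the module tree rooted at $1 holds modules for kernel release $2.
modules_present() {
    [ -f "$1/lib/modules/$2/modules.dep" ]
}

# Possible use before the 'rm' line above (not run here):
#   modules_present /root/000-kernel your-new-kernel-version || echo "modules missing, do not delete the old module yet"
```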

Reboot and enjoy the new kernel :)


Section III Updating initial ramdisk for pxe boot

To create a new initrdpxe.xz, please use the following script:


cd `pwd`
mkdir pxe
cd pxe
cp -a --parents /lib/modules/`uname -r`/kernel/drivers/net/ethernet .
for x in dca mii libphy sungem_phy ssb uio crc-ccitt; do
  pth="/lib/modules/`uname -r`/$(grep $x.ko: /lib/modules/`uname -r`/modules.dep | cut -d: -f1)"
  cp -a --parents $pth .
done
depmod -b .
find | cpio -H newc -o | xz --check=crc32 --x86 --lzma2 > ../../initrdpxe.xz
cd ..
rm -r pxe
mv ../initrdpxe.xz .
mv initrdpxe.xz /mnt/sdb1/boot/syslinux/pxelinux.cfg/initrdpxe.xz
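The loop in the script finds each module by grepping modules.dep for its file name. If you want to see what that lookup returns for a given module, a slightly stricter version can be sketched as below. The function name module_path is made up, and it assumes uncompressed modules ending in .ko, as the script does.

```shell
# Print the path (relative to /lib/modules/<version>/) of module $1,
# looked up in modules.dep-style text read from stdin.
module_path() {
    # Anchoring on "/" stops "mii" from also matching e.g. "usbmii.ko".
    grep "/$1\.ko:" | cut -d: -f1 | head -n 1
}

# Possible use (not run here):
#   module_path libphy < /lib/modules/`uname -r`/modules.dep
```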

Good luck!