CBN (Computational Biology and Neurocomputing) seminars

Experiences from GPU implementation of ANN-simulator

by Roland Orre (NeuroLogic Sweden AB, Wish-IT AB)

Europe/Stockholm
RB35

Description
The GPU (Graphics Processing Unit) is a SIMD (Single Instruction, Multiple Data) type of processor that is found in almost all computers nowadays. It was recognized quite early, in the mid-1990s, that it could be used as a general-purpose parallel number-crunching accelerator. Over time the technology has been developed and specialized, and this approach is nowadays known as GPGPU (general-purpose computing on GPUs), where the NVIDIA CUDA (Compute Unified Device Architecture) API (Application Programming Interface) is the best known. This talk presents experiences from porting an artificial neural network simulator, written in C++ by Anders Lansner, that models populations of neurons and the projections between them. We start with a description of the NVIDIA CUDA computational model, then outline what needs to be considered when porting existing code to CUDA or writing a new CUDA implementation. The task is not fundamentally hard, but there are many traps to deal with on the way to getting the code running with a verifiable result. Some practical hints will be given, along with the scaling behavior and the speedup that can easily be achieved. We will also discuss how rethinking the original computational model could push performance towards the limit of the available GPUs.