Stream Computing Workshop



The Stockholm Stream Computing Center and PRACE are organizing a workshop on stream computing. The workshop will offer an introduction to OpenCL and stream/GPU programming. It will consist of lectures and hands-on experience in using OpenCL on state-of-the-art stream processors. It will also offer lectures on stream processor architectures and programming tools for stream processors and multi-core systems. Lectures reporting on the successful use of stream processors in scientific applications will also be offered.

The workshop is suitable for scientists and graduate students with an interest in exploiting stream processing for applications. It requires good programming experience.

The workshop will be held at KTH, Stockholm.
The Monday afternoon and Tuesday morning lectures will be held at Sydvästgalleriet (SVG), Osquars Backe 31, 3rd floor (the KTH Library building); the Wednesday morning lectures in room 304 at Teknikringen 14 (3rd floor); the labs on Tuesday and Wednesday will take place in the SAM lab at Teknikringen 14 (3rd floor).

The event is part of PRACE's training and education programme, which aims to prepare and initiate a sustainable and comprehensive European HPC education and training programme encompassing summer schools, winter schools, training workshops and training material. Video material from previous PRACE training events is available on

The Stockholm Stream Computing Center was formed in 2008 by scientists at KTH’s High Performance Computing Center, PDC, and the Center for Biomembrane Research (CBR) at Stockholm University.

The Partnership for Advanced Computing in Europe (PRACE) is a project funded by the European Commission.

The program committee for this workshop consists of
Guillaume Colin de Verdiere, Commissariat à l'Energie Atomique (CEA), Paris
Pekka Manninen, Finnish Center for Scientific Computation (CSC - IT Center for Science), Helsinki
Lennart Johnsson, PDC - Center for High-Performance Computing, KTH and Department of Computer Science, University of Houston
Erwin Laure, PDC - Center for High-Performance Computing, KTH
Erik Lindahl, CBR, Stockholm University
Peter Munger, National Supercomputing Center (NSC), Linköping
Alan Simpson, Edinburgh Parallel Computing Center (EPCC), Edinburgh University
    • 1:00 PM 1:15 PM
      Opening 15m
    • 1:15 PM 2:00 PM
      Software acceleration for dummies - a survey of current technologies 45m
      Speeding up computationally heavy software by replacing traditional CPUs with other types of programmable devices is becoming a more and more viable technique. GPUs, FPGAs and Cell processors, which are the most common ones, offer tremendous computing power at lower cost and lower power consumption than CPUs. Their supporting ecosystems have also matured considerably over the last few years, making them increasingly interesting not just for experimental systems, but for a broad range of real-life applications. This presentation will survey the different technologies, the available hardware solutions and the available programming tools.
      Speaker: Magnus Peterson (Synective Labs)
    • 2:00 PM 2:45 PM
      Ct - a new paradigm for data parallel computing 45m
      Intel's Ct Technology provides a comprehensive set of data parallel abstractions that greatly simplify the task of writing parallel applications and at the same time deliver forward-scaling performance across a wide variety of multi- and manycore platforms. Ct enables developers to incrementally introduce data parallelism in C++ programs, while avoiding the risks of data races and/or deadlocks. The talk introduces the Ct concepts and discusses examples of how to use Ct effectively for HPC applications.
      Speaker: Hans-Christian Hoppe (Intel)
    • 2:45 PM 3:30 PM
      AMD & OpenCL: A Balanced Platforms Approach to Heterogeneous Computation 45m
      GPUs have been proven to offer benefits for accelerating computationally intensive algorithms. However, CPUs are by no means being made any less relevant for HPC workloads. Many applications require leveraging both the GPU and CPU to enable the greatest acceleration, i.e. heterogeneous computing. This presentation will provide an overview of AMD's vision for heterogeneous computing.
      Speaker: James Hrica (AMD)
    • 3:30 PM 4:00 PM
      Coffee 30m
    • 4:00 PM 4:45 PM
      OpenCL and GPU High Level Programming Tools 45m
      OpenCL is an initiative launched by Apple to ensure application portability across various types of GPUs. It aims at being an open standard (royalty free and vendor neutral) developed by the Khronos OpenCL working group. This talk will give an overview of the language used to program GPUs, its portability, and its place alongside other programming tools.
      Speaker: Stéphane Bihan (CAPS)
    • 4:45 PM 5:30 PM
      Molecular dynamics simulations on GPUs 45m
      The Open Molecular Mechanics (OpenMM) library provides tools and a consistent hardware-agnostic API for modern molecular modeling simulations, with emphasis on hardware acceleration (currently GPUs only). The talk will present an overview of the platform and discuss some of the implemented algorithms, how they differ from the standard CPU ones and how performance is affected by them.
      Speakers: Rossen Apostolov (Stockholm University), Szilard Pall (Stockholm University)
    • 5:30 PM 6:00 PM
      Nvidia/CUDA - applicability and problems by the example 30m
      As part of PRACE, a few kernels from the EuroBen benchmark had to be ported to all the available PRACE prototype architectures. This presentation focuses on the Nvidia/CUDA port and gives an overview of the Nvidia hardware used, the experience with CUDA and the available toolkit. It illustrates the porting effort, various problems and the results using three of those kernels as examples, namely a dense matrix-matrix multiplication, a sparse matrix-vector multiplication and a 1D fast Fourier transform.
      Speaker: Hans Hacker (LRZ)
    • 9:00 AM 12:30 PM
      OpenCL Tutorial

      What is OpenCL?

      • Design Goals

      • Execution Model

      • Platform and Memory Models

      Resource Setup and Resource Allocation

      • Setup

      • Kernel

      • Work-Item / workgroups

      Kernel Execution

      • Execution and Synchronization

      Programming with OpenCL™ C

      • Language Features

      • OpenCL API

      • Built-in Functions

    • 2:00 PM 5:30 PM
      Hands on Session
    • 9:00 AM 12:30 PM
      OpenCL Tutorial

      OpenCL Architecture and Optimization on AMD GPUs

      Harnessing the computational power of GPUs and CPUs

      • OpenCL development on heterogeneous platform

      • AMD’s balanced system approach.  ( Demos and


    • 2:00 PM 5:30 PM
      Hands on Session