High-Level GPU Programming (CSC-Vienna Scientific Cluster-EuroCC collaboration)

Modern HPC systems combine CPUs with accelerators such as GPUs or FPGAs, which makes optimizing code for each platform time-consuming. Cross-platform portability ecosystems provide a higher-level abstraction layer that simplifies parallel programming in shared-memory environments. Examples for C++ include SYCL and Kokkos. SYCL, an open standard from the Khronos Group, offers a unified C++ layer for parallel execution on CPUs, GPUs, FPGAs, and other devices. Kokkos Core, a C++ framework, enables high-performance applications across HPC platforms and addresses the challenges of intricate node architectures. Kokkos supports several backend programming models, including CUDA, HIP, SYCL, HPX, OpenMP, and C++ threads.

This training, organized by CSC in collaboration with VSC / EuroCC Austria, introduces GPU programming with SYCL and Kokkos for writing portable and performant accelerated applications. The course consists of lectures and hands-on sessions on LUMI, Mahti, and Intel DevCloud, which feature AMD, NVIDIA, and Intel GPUs, respectively. At the end of the training, participants also have the opportunity to apply what they have learned to their own coding projects and real-world application scenarios.

Where and when:

Wednesday 14th – Friday 16th February

This is an on-site event held at the CSC training facilities at Keilaranta 14, Espoo, Finland.
Although the event is on-site, the SYCL I, II, and III lectures are delivered remotely.

Learning outcome:

At the end of this training, participants will be able to:

  • write hardware-agnostic code in SYCL and Kokkos that expresses parallelism and runs on both CPUs and GPUs
  • manage memory across devices
  • do basic performance analysis
  • evaluate the trade-offs between different approaches to GPU programming


This course targets developers who know C++ and would like to learn how to program GPUs, as well as developers who already program GPUs with a non-portable approach such as CUDA or HIP and would like to write performant code that runs on a variety of computing platforms. To follow the course, participants are expected to have basic familiarity with C++ concepts such as raw pointers, classes, structures, templates, lambdas, and functors.

The content level of the course breaks down as follows: beginner – 70%, intermediate – 20%, advanced – 10%, community-targeted content – 0%.

Program (coarse-grained):

Day 1, Wednesday 14.02, 9:00-17:00

09:00-11:00 Introduction to GPUs and GPU parallel programming model
11:00-12:00 Refresher of C++ concepts
12:00-13:00 Lunch break
13:00-16:45 SYCL I
16:45-17:00 Day 1 wrap-up

Day 2, Thursday 15.02, 9:00-17:00

09:00-12:00 SYCL II
12:00-13:00 Lunch break
13:00-15:00 SYCL III
15:00-16:45 Mahti and LUMI (SYCL installation, usage and exercises)
16:45-17:00 Day 2 wrap-up

Day 3, Friday 16.02, 9:00-17:00

09:00-11:00 Kokkos
11:00-12:00 Interoperability with third-party libraries; multi-GPU and multi-node programming
12:00-13:00 Lunch break
13:00-16:45 Bring your own code
16:45-17:00 Day 3 wrap-up & Course closing