Onsite

Node-Level Performance Optimization

This course covers advanced topics in code optimization for x86 platforms (Intel and AMD CPUs). We discuss different techniques for analyzing and maximizing both single-core and multi-core performance within a single node. The topics include instruction-level parallelism, vectorization, and efficient utilization of cache and memory. The course consists of lectures and hands-on exercises.

Learning outcomes

– Awareness of the features and internal workings of x86 CPUs
– Ability to analyze and assess single-node performance
– Ability to vectorize computations
– Ability to optimize cache and memory access

Prerequisites

– Good knowledge of C/C++ or Fortran
– Good knowledge of threading using OpenMP
– Basic knowledge of modern CPU architectures

Agenda

Day 1
– Overview of performance engineering
– General overview of modern multicore CPUs
– Main memory performance
– Performance analysis tools

Day 2
– Deeper dive into caches
– Detailed look into Intel and AMD CPUs
– Advanced vectorization
– Additional optimization topics

Registration deadline: 3.5.2024

Advanced Intermediate

Time

23.5.2024 - 24.5.2024
09:00 - 16:30

Place

Life Science Center, Keilaranta 14
02150, Espoo

Price

EUR 120 (+24% VAT) for Finnish universities and higher education institutions, and for Finnish state research institutions and government organizations
EUR 560 (+24% VAT) for all other participants
