Node-Level Performance Optimization
This course covers advanced topics in code optimization for x86 platforms (Intel and AMD CPUs). We discuss techniques for analyzing and maximizing both single-core and multi-core performance within a single node. The topics include instruction-level parallelism, vectorization, and efficient utilization of cache and memory. The course consists of lectures and hands-on exercises.
Learning outcomes
– Awareness of features and internal workings of x86 CPUs
– Ability to analyze and assess single-node performance
– Ability to vectorize computations
– Ability to optimize cache and memory access
Prerequisites
– Good knowledge of C/C++ or Fortran
– Good knowledge of threading using OpenMP
– Basic knowledge of modern CPU architectures
Agenda
Day 1
– Overview of performance engineering
– General overview of modern multicore CPUs
– Main memory performance
– Performance analysis tools
Day 2
– Deeper dive into caches
– Detailed look at Intel and AMD CPUs
– Advanced vectorization
– Additional optimization topics
Deadline for registration: 3.5.2024
Time
23.5.2024 - 24.5.2024
09:00 - 16:30
Place
Life Science Center, Keilaranta 14
02150, Espoo
Price
EUR 120 (+24% VAT) for Finnish universities or institutions of higher education, and for Finnish state research institutions or government organizations
EUR 560 (+24% VAT) for all other participants