Onsite

High Performance R

Are you using R but not sure whether your R code makes the best use of the computing resources available? Would you like to learn to speed up R analyses with parallel computing, identify bottlenecks in your R scripts, or get tips on handling large datasets in R? Join our new course, which focuses on using R efficiently and making the most of R in a high performance computing environment.

The topics of this course include:

  • making use of the properties of R as a programming language to write efficient R code
  • exploring performance issues of R code by benchmarking and profiling processes and memory usage
  • parallel and distributed computing with R on both local and supercomputing resources

The topics will be covered using short lectures and/or demonstrations followed by hands-on exercises using RStudio and batch jobs on the supercomputer Puhti. Participants are welcome to bring their own R code (short script sections, not full projects) and a small dataset (maximum 5 GB) to use in some of the exercises. Please note, however, that we do not solve problems with the code itself.

Target audience:

This course is meant for anyone familiar with the basics of R and wanting to learn how to make their analyses in R more efficient and how to use R in a high performance computing environment. For example:

  • current users of RStudio in CSC’s Puhti web interface: move beyond RStudio and make the most of the computing resources of the supercomputer
  • R users who have so far run R only on their own computer: use your computer’s resources efficiently and learn to use R in a high performance computing environment
  • experienced users of another programming language and/or high performance computing: get familiar with the functional nature of the R language and its resource management

Where & when:

This is a two-day course, running from 9:00 to 16:00 on both days. The course will be offered on-site at the CSC Training Facilities (Keilaranta 14, Espoo, Finland). A Zoom link can be provided to participants who are not able to join on-site, but please note that this is not a hybrid course, so online participants will be offered limited support. For participants joining the course on-site in Espoo, lunch and a snack are included in the price.

Learning outcomes:

After attending this course, participants will be able to:

  • explore potential R code performance issues with benchmarking and profiling
  • understand the key properties of the R language and how they relate to the computer’s resource management
  • run R scripts with the batch job system on the supercomputer Puhti
  • get started with parallel and distributed computing with R

Pre-requisites:

Required:

  • basics of the R programming language
    • if you are a complete beginner with R and programming in general, we recommend the course Data Analysis with R instead

Useful to make the most of the course content but not required:

Lecturers:

Billy Braithwaite and Heli Juottonen (CSC)

Registration deadline: 16.9.2024

Price:
EUR 120 (+24% VAT) for Finnish Universities or institutions for higher education & Finnish state research institutions or government organizations
EUR 560 (+24% VAT) for Other