High Performance R
Description
Are you using R but not sure if your R code makes the best use of the computing resources available? Would you like to learn to speed up R analyses with parallel computing, identify bottlenecks in your R scripts, or get tips on handling large datasets in R? Join our new course that focuses on using R efficiently and making the most of R in a high performance computing environment.
The topics of this course include:
- making use of the properties of R as a programming language to write efficient R code
- exploring performance issues of R code by benchmarking and profiling processes and memory usage
- parallel and distributed computing with R on both local and supercomputing resources.
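As a rough illustration of the first two topics, the sketch below times an explicit loop against a vectorised expression using base R's system.time(). The function names and the input size are made up for this example and are not taken from the course material.

```r
# Sum of squares computed two ways: an explicit loop and a vectorised expression.
# Both function names are hypothetical, chosen only for this illustration.
sum_squares_loop <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value^2
  }
  total
}

sum_squares_vectorised <- function(x) {
  sum(x^2)
}

x <- as.numeric(1:1e6)

# system.time() reports user, system and elapsed time in seconds
time_loop <- system.time(result_loop <- sum_squares_loop(x))
time_vec <- system.time(result_vec <- sum_squares_vectorised(x))

# Both approaches should give the same answer (up to floating-point tolerance)
stopifnot(isTRUE(all.equal(result_loop, result_vec)))

print(time_loop["elapsed"])
print(time_vec["elapsed"])
```

The timings depend on the machine and the R version, so none are quoted here. For finer-grained measurements, Rprof() from the utils package can record where a script spends its time.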
The topics will be covered using short lectures and/or demonstrations followed by hands-on exercises using RStudio and batch jobs on the supercomputer Puhti. Participants are welcome to bring their own R code (short script sections, not full projects) and a small data set (maximum 5 GB) to use in some of the exercises (but note that we do not solve any problems with the code itself).
Target audience
This course is meant for anyone familiar with the basics of R and wanting to learn how to make their analyses in R more efficient and how to use R in a high performance computing environment. For example:
- current users of RStudio in CSC’s Puhti web interface: move beyond RStudio and make the most of the computing resources of the supercomputer
- R users running R on their own computer so far: use your computer’s resources efficiently and learn to use R in a high performance computing environment
- experienced users of another programming language and/or high performance computing: get familiar with the functional nature of the R language and its resource management.
Where & when
This is a two-day course from 9:00 to 16:00. The course will be offered on-site at the CSC Training Facilities (Keilaranta 14, Espoo, Finland). A Zoom link can be provided to participants who are not able to join on-site, but please note that this is not a hybrid course, so online participants will be offered limited support. For participants joining the course on-site in Espoo, lunch and a snack are included in the price.
Learning outcomes
After attending this course, participants will be able to:
- explore potential R code performance issues with benchmarking and profiling
- understand the key properties of the R language and how they relate to the computer’s resource management
- run R scripts with the batch job system on the supercomputer Puhti
- get started with parallel and distributed computing with R.
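To give a flavour of the last outcome, here is a minimal sketch of local parallel computing with the parallel package that ships with R. The worker count of 2, the toy function slow_square and the input range are assumptions made for this example. It does not show how jobs are configured on Puhti.

```r
library(parallel)

# A deliberately slow toy function (hypothetical, for illustration only)
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

inputs <- 1:20
n_workers <- 2  # assumed value; choose according to the cores available to you

# Serial baseline
serial_result <- lapply(inputs, slow_square)

# Start a local cluster of R worker processes and always shut it down afterwards
cl <- makeCluster(n_workers)
parallel_result <- tryCatch(
  parLapply(cl, inputs, slow_square),
  finally = stopCluster(cl)
)

# The parallel version returns the same list as the serial one
stopifnot(identical(serial_result, parallel_result))
```

Starting workers and sending data to them has a cost, so parallel execution tends to pay off only when each task does enough work to outweigh that overhead.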
Pre-requisites
Required:
- basics of the R programming language
- if you are a complete beginner with R and programming in general, we recommend the course Data Analysis with R instead.
Useful to make the most of the course content but not required:
- basics of Linux (for example: very basics, more extensive tutorial)
- some experience in using the supercomputer Puhti, for example through RStudio in the Puhti web interface, the course CSC Computing Environment, Part 1: Basics, or the corresponding self-learning materials.
Lecturers
Billy Braithwaite and Heli Juottonen (CSC).
Registration deadline: 16.9.2024
Time
26.9.2024 - 27.9.2024
09:00 - 17:00
Place
CSC Training Facilities, Keilaranta 14
02100, Espoo
Price
EUR 120 (+25.5% VAT) for Finnish Universities or institutions for higher education, Finnish state research institutions or government organizations
EUR 560 (+25.5% VAT) for Other
Organizer
Billy Braithwaite and Heli Juottonen (CSC)