ONLINE: Containers and workflows in bioinformatics
Bioinformatics tools often depend on specific libraries and software versions that must be installed in a controlled environment. Containers let you package an application (e.g., a bioinformatics tool) together with its libraries and other dependencies, providing an isolated environment in which to run it. A containerised application runs in this isolated runtime regardless of the underlying host (e.g., a private data center, the public cloud, or a developer’s personal laptop). Containers have recently been gaining popularity among developers and system administrators as a standard way to distribute, deploy, and run services. This course focuses on deploying containerised applications in an HPC environment. It also introduces a modern workflow manager, Nextflow, for performing complex analyses in bioinformatics.
Expected learning in this course
This basic course introduces the fundamentals of container technology along with selected examples of containerised bioinformatics applications. A basic understanding of containers is needed to work with bioinformatics applications in containerised environments that have different options and requirements.
More specifically, you will learn:
– Basic concepts of the CSC supercomputing environment (Puhti)
– The essential concepts of using containers
– Containerised applications in bioinformatics
– The basics of running Singularity containers in an HPC environment
– Basic introduction to Nextflow
After this course, you will be able to launch and work with containerised applications in an HPC environment.
Pre-requisites:
To get the maximum benefit from this course, you should be comfortable working on the Linux command line and able to use a common text editor (e.g., vi, nano, or emacs).
Ideal candidates for this course are:
– Bioinformaticians or computer scientists with some bio-background
– Biologists with Linux skills and/or basic knowledge of HPC environments
Expected way of learning
– Lectures
– Hands-on exercises
Practicalities (More information will be updated here)
The e-Lena e-learning platform will be used in the course.
The course will be taught over four half days, from 09:00 to 12:00 each day.
Program, 23rd November
09:00 – 09:20 Course preliminaries
09:20 – 09:30 Warm-up with the HackMD environment
09:30 – 10:00 Introduction to the CSC HPC environment
10:00 – 10:30 Tutorial – HPC basics
10:30 – 10:40 Break
10:40 – 11:10 Fundamentals of containers
11:10 – 11:40 Tutorial – Hello-World example
11:40 – 12:00 Wrap-up
Program, 24th November
09:00 – 09:30 Using container images in an HPC environment
09:30 – 10:30 Tutorial – Using container images in an HPC environment
10:30 – 10:40 Break
10:40 – 11:10 Containerised bio applications
11:10 – 11:40 Tutorial – Containerised bio applications
11:40 – 12:00 Wrap-up
Program, 25th November
09:00 – 09:30 Converting Docker images to Singularity images
09:30 – 10:30 Tutorial – Converting Docker images to Singularity images
10:30 – 10:40 Break
10:40 – 11:10 Building Singularity container images
11:10 – 11:40 Tutorial – Building Singularity container images
11:40 – 12:00 Wrap-up
Program, 26th November
09:00 – 09:30 Introduction to Nextflow
09:30 – 10:00 Tutorial – Hello-World example
10:00 – 10:30 Using Singularity containers in Nextflow
10:30 – 10:40 Break
10:40 – 11:00 Tutorial – Using Singularity containers in Nextflow
11:00 – 11:40 Running Nextflow at CSC
11:40 – 12:00 Wrap-up
Lecturers
Laxman Yetukuri and Ari-Matti Saren
Time
23.11.2021 - 26.11.2021
09:00 - 12:00