gatk2019 - Training
Date: 14.05.2019 9:00 - 17.05.2019 17:00
Location details: The lecture day (14.5.) is held at the Haartman Institute, Lecture Hall 2 (luentosali 2), Haartmaninkatu 3. The hands-on days (15.-17.5.) take place in the Dogmi computer classroom at CSC, Keilaranta 14, Espoo. The best way to reach us is by public transportation; more detailed travel tips are available.
Language: English
Lecturers: Geraldine Van der Auwera (Broad Institute), Bhanu Gandham (Broad Institute), Audrey Smirnov (Broad Institute)
Price: The registration fee is 60 euros + VAT per day. The fee covers morning and afternoon coffee. Students from the Integrative Life Science (ILS) doctoral programme can register for free. NOTE: Students from the following graduate schools can register for free for the lecture day: KLTO, DPBM, CVM, FinDos, DPDR, DocPop, and Brain & Mind.
Payment can be made by electronic invoicing, credit card, or direct bank transfer. For electronic invoicing you need your organization's operator and e-invoicing address (OVT code). If your organization requires an invoice reference for electronic invoicing, please have it available when registering.
Practicalities: event-support@csc.fi
This workshop will focus on the core steps involved in calling variants with the Broad Institute's Genome Analysis Toolkit (GATK), using the "Best Practices" developed by the GATK team. You will learn why each step is essential to the variant discovery process, what operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. This course is organized in collaboration with the Doctoral Programme in Integrative Life Science (ILS) of the University of Helsinki.
This workshop highlights key functionalities such as the germline GVCF workflow for joint variant discovery in cohorts, RNAseq-specific processing, and new somatic variant discovery capabilities in GATK4. It also covers the use of pipelining tools to assemble and execute GATK workflows.
Please note that this workshop is focused on human data analysis. Most of the material presented applies equally to non-human data, and we will address some questions regarding the adaptations needed for analysis of non-human data, but we will not go into much detail on those points.
Target audience and prerequisites
The lecture day of the workshop is aimed at a mixed audience: people who are new to variant discovery or to GATK and are seeking an introductory course on the tools, as well as existing GATK users who want to improve their understanding of and proficiency with the tools. Attendees should already be familiar with the basic terms and concepts of genetics and genomics.
The hands-on days are aimed at novice to intermediate users who are seeking detailed guidance with GATK and related tools. Basic familiarity with the command line environment is required.
Learning objectives
After this course you should be able to:
- Understand the overall variant discovery workflow rationale and requirements
- Understand key methods and functionalities in light of the latest research
- Understand key differences between germline and somatic variant discovery approaches
- Apply analysis tools and Best Practices workflows to a real data set
- Interpret analysis results and troubleshoot common problems
Aims
During this course you will learn about:
- Preprocessing of high-throughput DNA- and RNA-seq data
- Variant discovery (germline and somatic short variants, somatic CNV)
- Germline variant filtering and evaluation
Recommended reading for obtaining credits:
Students wishing to earn 2 ECTS credits from the course are expected to read at least three articles: the GATK paper and any two of the review articles.
GATK paper:
https://currentprotocols.onlinelibrary.wiley.com/doi/abs/10.1002/0471250953.bi1110s43
Reviews:
Variant annotation: https://www.nature.com/articles/nrg.2017.52.pdf
Mendelian genetics: https://www.nature.com/articles/nrg.2017.116
Somatic mutations: https://www.nature.com/articles/nrg.2017.8
Day 1 (Tue, 14.05 at Haartman Institute): Introduction to Genomic Analysis (lectures on main concepts)
Morning (9:00am - 12:00pm)
- Opening Remarks
- Introduction to Sequencing Data
- Introduction to Data Preprocessing
- Introduction to Variant Discovery
- Introduction to Germline Variant Discovery
- Introduction to Somatic Variant Discovery
Afternoon (1:00pm - 4:00pm)
- Introduction to Pipelining Platforms
- Terra Orientation
- Case Study
Day 2 (Wed, 15.05 at CSC): Germline Short Variant Discovery
Morning (9:00am - 12:00pm)
- Recap on Germline Variant Discovery
- HaplotypeCaller
- Joint Calling
- Germline Variant Discovery Tutorial (hands-on)
Afternoon (1:00pm - 4:00pm)
- Variant Filtering
- Genotype Refinement
- Callset Evaluation
- Germline Hard Filtering Tutorial (hands-on)
Day 3 (Thu, 16.05 at CSC): Somatic Variant Discovery
Morning (9:00am - 12:00pm)
- Recap on Somatic Variant Discovery
- Somatic SNVs and Indels
- GATK4 Mutect2 Tutorial (hands-on)
Afternoon (1:00pm - 4:00pm)
- Somatic CNAs
- GATK4 Somatic CNA Tutorial (hands-on)
- GATK Best Practices for SNP/Indel Variant Calling in Mitochondria (demo)
Day 4 (Fri, 17.05 at CSC): Pipelining with WDL and Cromwell
Morning (9:00am - 12:00pm)
- The Basics of WDL and Cromwell
- Hello World WDL Tutorial (hands-on)
- Docker
- Handling Sensitive Data
Afternoon (1:00pm - 4:00pm)
- Advanced WDL Tutorial (hands-on)
- WDL Puzzles
- Wrap-up