gatk2017 - Training
CSC trainings and events have moved.
You can find upcoming trainings and events at www.csc.fi/asiakaskoulutus.
This site is an archived version and is no longer updated.
Date: | 13.09.2017 9:00 - 15.09.2017 17:00 |
Location details: | The lecture day 13.9 is organised in Biomedicum Helsinki 1 in Seminar room 3 at Haartmaninkatu 8. The hands-on days 14.-15.9 take place in the computer classroom Dogmi at CSC at Keilaranta 14, Espoo. |
Language: | English |
Lecturers: | Laura Gauthier (MIT Broad), Yossi Farjoun (MIT Broad), Soo Hee Lee (MIT Broad) |
Price: | The registration fee is 60 euros + VAT per day. The fee covers morning and afternoon coffees. DPBM students register for free; please contact Anita Tienhaara. |
Payment can be made with electronic invoicing, credit card, or direct bank transfer. For electronic invoicing you need the operator and e-invoicing address (OVT code) of your organization. If your organization requires an invoice reference for electronic invoicing, please have it available when registering.
Practicalities: event-support@csc.fi
This workshop will focus on the core steps involved in calling variants with the Broad Institute's Genome Analysis Toolkit (GATK), using the "Best Practices" developed by the GATK team. You will learn why each step is essential to the variant discovery process, what operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. This course is organized in collaboration with the Doctoral Programme in Biomedicine (DPBM) of the University of Helsinki.
This workshop highlights key functionalities such as the germline GVCF workflow for joint variant discovery in cohorts, RNAseq-specific processing, and new somatic variant discovery capabilities in GATK4. It also covers the use of pipelining tools to assemble and execute GATK workflows.
Please note that this workshop is focused on human data analysis. Most of the material presented applies equally to non-human data, and we will address some questions about the adaptations needed for analysis of non-human data, but we will not go into much detail on those points.
Program outline
The workshop is composed of one day of lectures and two days of hands-on training, structured as follows:
Wed 13.9 Lectures: Rationale, theory and application of the GATK Best Practices for Variant Discovery in high-throughput sequencing data.
Thu 14.9 AM: hands-on: Germline variant discovery (SNPs, Indels)
Thu 14.9 PM: hands-on: Germline variant filtering (SNPs, Indels)
Fri 15.9 AM: hands-on: Somatic variant discovery (SNPs, Indels)
Fri 15.9 PM: hands-on: Somatic variant discovery (CNV)
In the optional hands-on sessions focused on analysis, we walk attendees through exercises that teach them how to manipulate the standard data formats involved in variant discovery and how to apply GATK tools appropriately to common use cases and data types. In the course of these exercises, we demonstrate useful tips and tricks for interacting with GATK and Picard tools, dealing with problems, and using third-party tools such as Samtools, IGV, RStudio and RTG Tools.
Detailed agenda of the Best Practices lectures 13.9.2017 (9:00-16:50)
Introduction to variant discovery analysis
GATK Best Practices workflows
Pipelining with Cromwell and the Broad's Workflow Description Language (WDL)
10:30-11:00 Coffee Break
Marking Duplicates
Base Recalibration
12:00-13:00 Lunch Break
Variant Calling and Joint Genotyping
Filtering variants with VQSR
Genotype Refinement Workflow
14:20-14:50 Coffee Break
Callset Evaluation
Somatic SNV and indel discovery with MuTect2 in GATK4
Somatic CNV discovery with GATK4
Target audience and prerequisites
The lecture day of the workshop is aimed at a mixed audience: people who are new to variant discovery or to GATK and want an introduction to the tools, and existing GATK users who want to improve their understanding of and proficiency with the tools. Attendees should already be familiar with the basic terms and concepts of genetics and genomics.
The hands-on days are aimed at novice to intermediate users who are seeking detailed guidance with GATK and related tools. Basic familiarity with the command line environment is required.
After this course you should be able to:
- Understand the overall variant discovery workflow rationale and requirements
- Understand key methods and functionalities in light of the latest research
- Understand key differences between germline and somatic variant discovery approaches
- Apply analysis tools and Best Practices workflows to a real data set
- Interpret analysis results and troubleshoot common problems
During this course you will learn about:
- Preprocessing of high-throughput DNA- and RNA-seq data
- Variant discovery (germline and somatic short variants, somatic CNV)
- Germline variant filtering and evaluation
Course material
The lecture slides, hands-on exercises and course data are available in the course material folder.
Participants who were allocated extra seats to attend the hands-on sessions with their own laptop need to install the course software beforehand. Please also download the course data before the course.