Practical session
This workshop will explore rare variant analysis using the STAAR pipeline.
Practical Overview:
- Rare variant association analysis of WGS data
- WGS data preparation and annotation
- PC / Sparse Genetic Relatedness Matrix (GRM) generation
- Functionally-informed rare variant association analysis
- Conditional analysis
- Annotating rare variant analysis results
Practical Goals:
- To prepare 1000G WGS data in a Genomic Data Structure (GDS) format
- To functionally annotate 1000G WGS data into an annotated GDS (aGDS) format
- To generate ancestry principal components and a sparse GRM (using FastSparseGRM)
- To perform a functionally-informed rare variant analysis using a linear mixed model
- To perform conditional analysis for a significant rare variant set (mask) of interest
- To annotate a significant rare variant set (mask) of interest
Data Used
- 1000G High Coverage WGS:
- 1000G Phase 3:
- 1000G Original:
Programs Used
- GDS:
- aGDS:
- FastSparseGRM:
- STAARpipeline:
Workshop Material
Because of the workshop's time and resource limits, we will not run a live demo of the preprocessing steps. For those steps, please refer to the file and to the 1000G_scripts_part1 folder.
In this workshop, we will focus on the R scripts in 1000G_scripts_part2.
Let’s get started
Please open the 05_RareVariants.ipynb Google Colab notebook.