Category: Genomics

How To Perform Genome-Wide Association Analysis (GWAS) For Absolute Beginners: From Raw Variants to Disease-Associated Loci Using PLINK
Introduction: Understanding Genome-Wide Association Studies
After successfully calling variants from whole genome sequencing data (covered in Part 1 of our WGS series), you now have VCF files containing millions of genetic variants across multiple individuals. But which of these variants contribute to disease risk or influence quantitative traits? This is where Genome-Wide Association Studies (GWAS)

How To Analyze Whole Exome Sequencing Data For Absolute Beginners: From Raw Reads to High-Quality Variants, Mutations, and CNVs
Introduction: Understanding Whole Exome Sequencing vs. Whole Genome Sequencing
What is Whole Exome Sequencing (WES)?
Whole Exome Sequencing (WES) is a targeted sequencing approach that focuses specifically on the protein-coding regions of the genome, known as exons. While the human genome contains approximately 3.2 billion base pairs, the exome represents only about 1-2% of this

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 6-2: Identifying Tumor Copy Number Variants Using CNVkit
Introduction: Understanding Tumor Copy Number Variants
Cancer is fundamentally a disease of genomic instability, where normal cells accumulate mutations that drive uncontrolled growth and metastasis. Among these mutations, somatic copy number alterations (SCNAs) – also known as tumor CNVs – play a pivotal role in cancer initiation, progression, and treatment resistance. This tutorial builds upon

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 6: Identifying Germline Copy Number Variants
Introduction: Understanding Copy Number Variants
While single nucleotide variants (SNVs) and small insertions/deletions (indels) capture much of the attention in genomic analysis, they represent only part of the story of human genetic variation. Copy Number Variants (CNVs) – duplications and deletions of large segments of DNA – play an equally important role in human genetics,

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 5: Identifying Disease- or Patient-Specific Variants
Introduction: From Variants to Disease Genes
After successfully calling variants in your whole genome sequencing samples (as covered in Part 1 of this series), you now face an exciting challenge: among the millions of genetic variants present in the human genome, which ones are actually responsible for disease? Every human genome contains approximately 4-5 million

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 4: Visualizing and Interpreting Somatic Mutations
Introduction: From Multiple VCF Files to Biological Insights
This tutorial builds upon our previous whole genome sequencing analysis pipeline, specifically the mutation calling results from Part 2A: Matched Tumor-Normal Mutation Calling with Mutect2. You should now have multiple high-confidence VCF files from different tumor-normal pairs that need to be converted to MAF (Mutation Annotation Format)

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 3: Annotating SNVs and Mutations with Multiple Tools
A comprehensive step-by-step guide to understanding the functional impact of genomic variants using GATK Funcotator, Ensembl VEP, SnpEff, and ANNOVAR
Introduction: From Variants to Biological Meaning
After successfully identifying genomic variants using GATK (covered in Part 1) and discovering somatic mutations with Mutect2 (detailed in Part 2A), you now have VCF files containing thousands of

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 2B: Unmatched Sample Mutation Calling Strategies
Introduction: Real-World Mutation Calling Challenges
Welcome to Part 2B of our somatic mutation analysis series! In Part 2A, we learned the gold standard approach using matched tumor-normal pairs. However, in real-world scenarios, you often face situations where matched normal samples aren’t available.
Common Unmatched Sample Scenarios
Clinical Archives: Historical tumor samples without corresponding normal tissue
Population

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 2A: Matched Tumor-Normal Mutation Calling With Mutect2
Introduction to Matched Tumor-Normal Analysis
Welcome back to our whole genome sequencing analysis journey! In Part 1, we learned how to process raw sequencing data and identify germline variants using GATK’s best practices. Now we’re ready to tackle the gold standard approach for detecting somatic mutations: matched tumor-normal analysis.
What Are Somatic Mutations?
Somatic mutations

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 1: From Raw Reads to High-Quality Variants Using GATK
Understanding Whole Genome Sequencing and Its Applications
What is Whole Genome Sequencing (WGS)?
Whole Genome Sequencing represents one of the most comprehensive approaches to studying genetic variation across an entire organism’s DNA. Unlike targeted sequencing approaches that focus on specific regions of interest, WGS captures virtually every nucleotide in the genome, providing an unbiased view



