-

How To Perform Genome-Wide Association Analysis (GWAS) For Absolute Beginners: From Raw Variants to Disease-Associated Loci Using PLINK
Introduction: Understanding Genome-Wide Association Studies
After successfully calling variants from whole genome sequencing data (covered in Part 1 of our WGS series), you now have VCF files containing millions of genetic variants across multiple individuals. But which of these variants contribute to disease risk or influence quantitative traits? This is where Genome-Wide Association Studies (GWAS)
-

How to Cluster RNA-seq Data to Uncover Gene Expression Patterns: Hierarchical and K-means Methods for Absolute Beginners
Introduction: Understanding Clustering in RNA-seq Analysis
In the vast landscape of gene expression data, patterns often hide in plain sight. Among thousands of genes measured simultaneously, groups of genes may share similar expression patterns across samples, suggesting coordinated biological functions or responses. Clustering analysis serves as a powerful computational microscope that brings these hidden patterns
-

How To Analyze Whole Exome Sequencing Data For Absolute Beginners: From Raw Reads to High-Quality Variants, Mutations, and CNVs
Introduction: Understanding Whole Exome Sequencing vs. Whole Genome Sequencing
What is Whole Exome Sequencing (WES)?
Whole Exome Sequencing (WES) is a targeted sequencing approach that focuses specifically on the protein-coding regions of the genome, known as exons. While the human genome contains approximately 3.2 billion base pairs, the exome represents only about 1-2% of this
-

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 6-2: Identifying Tumor Copy Number Variants Using CNVkit
Introduction: Understanding Tumor Copy Number Variants
Cancer is fundamentally a disease of genomic instability, where normal cells accumulate mutations that drive uncontrolled growth and metastasis. Among these mutations, somatic copy number alterations (SCNAs) – also known as tumor CNVs – play a pivotal role in cancer initiation, progression, and treatment resistance. This tutorial builds upon
-

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 6: Identifying Germline Copy Number Variants
Introduction: Understanding Copy Number Variants
While single nucleotide variants (SNVs) and small insertions/deletions (indels) capture much of the attention in genomic analysis, they represent only part of the story of human genetic variation. Copy Number Variants (CNVs) – duplications and deletions of large segments of DNA – play an equally important role in human genetics,
-

How to Perform Master Regulator Analysis on RNA-seq Data Using RegEnrich and RTN – A Complete Beginner’s Guide
Discover the transcription factors controlling gene expression changes in your RNA-seq experiments
Introduction: Understanding Master Regulator Analysis
In the intricate symphony of gene regulation, not all transcription factors play equal roles. Some act as “master regulators” – key transcription factors that orchestrate broad changes in gene expression programs, controlling entire networks of downstream genes. Identifying
-

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 5: Identifying Disease- or Patient-Specific Variants
Introduction: From Variants to Disease Genes
After successfully calling variants in your whole genome sequencing samples (as covered in Part 1 of this series), you now face an exciting challenge: among the millions of genetic variants present in the human genome, which ones are actually responsible for disease? Every human genome contains approximately 4-5 million
-

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 4: Visualizing and Interpreting Somatic Mutations
Introduction: From Multiple VCF Files to Biological Insights
This tutorial builds upon our previous whole genome sequencing analysis pipeline, specifically the mutation calling results from Part 2A: Matched Tumor-Normal Mutation Calling with Mutect2. You should now have multiple high-confidence VCF files from different tumor-normal pairs that need to be converted to MAF (Mutation Annotation Format)
-

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 3: Annotating SNVs and Mutations with Multiple Tools
A comprehensive step-by-step guide to understanding the functional impact of genomic variants using GATK Funcotator, Ensembl VEP, SnpEff, and ANNOVAR
Introduction: From Variants to Biological Meaning
After successfully identifying genomic variants using GATK (covered in Part 1) and discovering somatic mutations with Mutect2 (detailed in Part 2A), you now have VCF files containing thousands of
-

How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 2B: Unmatched Sample Mutation Calling Strategies
Introduction: Real-World Mutation Calling Challenges
Welcome to Part 2B of our somatic mutation analysis series! In Part 2A, we learned the gold standard approach using matched tumor-normal pairs. However, in real-world scenarios, you often face situations where matched normal samples aren’t available.
Common Unmatched Sample Scenarios
Clinical Archives: Historical tumor samples without corresponding normal tissue
Population



