- How to Cluster RNA-seq Data to Uncover Gene Expression Patterns: Hierarchical and K-means Methods for Absolute Beginners
Introduction: Understanding Clustering in RNA-seq Analysis
In the vast landscape of gene expression data, patterns often hide in plain sight. Among thousands of genes measured simultaneously, groups of genes may share similar expression patterns across samples, suggesting coordinated biological functions or responses. Clustering analysis serves as a powerful computational microscope that brings these hidden patterns
- How To Analyze Whole Exome Sequencing Data For Absolute Beginners: From Raw Reads to High-Quality Variants, Mutations, and CNVs
Introduction: Understanding Whole Exome Sequencing vs. Whole Genome Sequencing
What is Whole Exome Sequencing (WES)?
Whole Exome Sequencing (WES) is a targeted sequencing approach that focuses specifically on the protein-coding regions of the genome, known as exons. While the human genome contains approximately 3.2 billion base pairs, the exome represents only about 1-2% of this
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 6-2: Identifying Tumor Copy Number Variants Using CNVkit
Introduction: Understanding Tumor Copy Number Variants
Cancer is fundamentally a disease of genomic instability, where normal cells accumulate mutations that drive uncontrolled growth and metastasis. Among these mutations, somatic copy number alterations (SCNAs) – also known as tumor CNVs – play a pivotal role in cancer initiation, progression, and treatment resistance. This tutorial builds upon
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 6: Identifying Germline Copy Number Variants
Introduction: Understanding Copy Number Variants
While single nucleotide variants (SNVs) and small insertions/deletions (indels) capture much of the attention in genomic analysis, they represent only part of the story of human genetic variation. Copy Number Variants (CNVs) – duplications and deletions of large segments of DNA – play an equally important role in human genetics,
- How to Perform Master Regulator Analysis on RNA-seq Data Using RegEnrich and RTN – A Complete Beginner’s Guide
Discover the transcription factors controlling gene expression changes in your RNA-seq experiments
Introduction: Understanding Master Regulator Analysis
In the intricate symphony of gene regulation, not all transcription factors play equal roles. Some act as “master regulators” – key transcription factors that orchestrate broad changes in gene expression programs, controlling entire networks of downstream genes. Identifying
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 5: Identifying Disease- or Patient-Specific Variants
Introduction: From Variants to Disease Genes
After successfully calling variants in your whole genome sequencing samples (as covered in Part 1 of this series), you now face an exciting challenge: among the millions of genetic variants present in the human genome, which ones are actually responsible for disease? Every human genome contains approximately 4-5 million
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 4: Visualizing and Interpreting Somatic Mutations
Introduction: From Multiple VCF Files to Biological Insights
This tutorial builds upon our previous whole genome sequencing analysis pipeline, specifically the mutation calling results from Part 2A: Matched Tumor-Normal Mutation Calling with Mutect2. You should now have multiple high-confidence VCF files from different tumor-normal pairs that need to be converted to MAF (Mutation Annotation Format)
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 3: Annotating SNVs and Mutations with Multiple Tools
A comprehensive step-by-step guide to understanding the functional impact of genomic variants using GATK Funcotator, Ensembl VEP, SnpEff, and ANNOVAR
Introduction: From Variants to Biological Meaning
After successfully identifying genomic variants using GATK (covered in Part 1) and discovering somatic mutations with Mutect2 (detailed in Part 2A), you now have VCF files containing thousands of
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 2B: Unmatched Sample Mutation Calling Strategies
Introduction: Real-World Mutation Calling Challenges
Welcome to Part 2B of our somatic mutation analysis series! In Part 2A, we learned the gold standard approach using matched tumor-normal pairs. However, in real-world scenarios, you often face situations where matched normal samples aren’t available.
Common Unmatched Sample Scenarios
Clinical Archives: Historical tumor samples without corresponding normal tissue
Population
- How To Analyze Whole Genome Sequencing Data For Absolute Beginners Part 2A: Matched Tumor-Normal Mutation Calling With Mutect2
Introduction to Matched Tumor-Normal Analysis
Welcome back to our whole genome sequencing analysis journey! In Part 1, we learned how to process raw sequencing data and identify germline variants using GATK’s best practices. Now we’re ready to tackle the gold standard approach for detecting somatic mutations: matched tumor-normal analysis.
What Are Somatic Mutations?
Somatic mutations