RNA-seq Analysis
for Absolute Beginners
A 7-session live workshop that takes you from zero command-line experience to a complete, publication-ready RNA-seq analysis pipeline, step by step, with your instructor guiding every line of code.
NGS data analysis is no longer optional
Whether you’re in academia or industry, the ability to analyze your own sequencing data is rapidly becoming a baseline expectation, not a bonus skill.
You understand your biology. Own your data too.
RNA-seq has become routine in labs of every size. No external bioinformatician knows your experimental system, your controls, or your biological hypotheses the way you do. Analyzing your own data doesn’t just save time; it generates deeper, more accurate insights.
Wet lab + dry lab = an unfair advantage
Top pharmaceutical and biotech companies actively seek researchers who can move fluidly between bench and computation. This hybrid profile is rare, commands higher salaries, and opens doors that remain closed to specialists on either side alone.
The field has moved. Don’t get left behind.
NGS data is now universal across life sciences. Labs without dedicated bioinformaticians are competing for collaborators, waiting months for results, and missing publication deadlines. Knowing how to handle this data is simply part of being a competitive researcher today.
I spent years lost in the dark.
You don’t have to.
I was a wet lab researcher in graduate school when RNA-seq was cutting-edge. My lab sequenced a non-model animal’s transcriptome, and nobody in the entire department knew how to analyze the data. I had to beg someone in the medical school for help. They took forever, and eventually told me they couldn’t do it.
That moment changed my career. I swore I would learn to analyze NGS data independently. But it wasn’t easy. I didn’t know where to start. I spent years in the dark, Googling desperately, piecing together fragments from tutorials that all assumed prerequisites I didn’t have. It took me years to find the actual learning path.
Today, I’m a computational biologist who has collaborated with hundreds of wet-lab researchers. And I see the same frustration I felt every single week. Researchers waiting months for results. PIs unable to evaluate their own data. Students stuck before they even get started.
I built NGS101.com to help. Thousands of researchers use my tutorials every month. But email after email told me the same thing: even with detailed written tutorials, beginners still couldn’t find a clear place to start. So I built this workshop to do what a written guide never can: walk alongside you, in real time, step by step.
Why beginners get stuck
The biology isn’t the hard part. Here’s what actually blocks researchers from getting started.
The Linux command line, before a single analysis even begins
Almost every bioinformatics tool runs in a Linux terminal. Most tutorials skip this entirely, assuming you already know it. Most beginners don’t, and get stuck immediately.
Tool installation feels like black magic
Setting up STAR, featureCounts, or Salmon from scratch involves dependency management, PATH variables, and environment configuration, all opaque to someone just trying to analyze RNA-seq data.
Personal computers simply can’t handle the compute
Genome indexing, alignment, and quantification require significant RAM and storage. These jobs need a server or an HPC cluster, and most beginners have no path to one.
A maze of file formats with no map
FASTQ, BAM, SAM, GTF, BED, VCF: knowing which tool requires which format, and how to convert between them, is genuinely confusing when you’re starting out.
Written tutorials have no “start here” arrow
Even comprehensive tutorials can overwhelm a beginner who doesn’t know which section to read first, what to skip, or how to connect the pieces into a working pipeline.
Everything designed
for the absolute beginner
Every obstacle above has a specific solution built into this workshop. Here’s how.
Pre-configured cloud environment: just log in
No installation, no local setup, no HPC account needed. I provide a ready-to-go Linux environment in the cloud. On Day 1, you’re already running real commands on real data.
Linux from zero: live, guided practice
We start with “how to open a terminal” and build up. Every command is explained, practiced, and applied in the context of bioinformatics. You’ll be fluent in the commands that actually matter.
Real data formats, demystified live
I’ll open and explain every file format you’ll encounter (FASTQ quality scores, SAM/BAM alignment files, GTF annotations) using real sequencing data, not abstract examples.
Lean R โ only what you need for RNA-seq
Most R courses try to teach the whole language. We won’t. We cover exactly the R you need for differential expression analysis, nothing more, nothing less. Efficient and immediately applicable.
Ready-to-use scripts for your own data
You’ll leave with annotated, working scripts covering the complete RNA-seq pipeline: code you can run on your own data starting the day after the workshop ends.
Lifetime recordings + 1 month of email support
Every session is recorded. Re-watch any step as many times as needed. Plus one month of direct email support while you apply what you learned to your own data.
Seven sessions. One complete pipeline.
From opening a terminal for the first time to submitting your data to NCBI GEO: every step, every tool, every concept.
Linux from scratch + environment setup
Your first time in the terminal? We start here. Learn to navigate the Linux command line, organize a project directory, and install bioinformatics tools using conda. By the end, the terminal feels powerful, not scary.
From raw reads to your first count matrix
Learn to read FASTQ files, interpret Phred quality scores, and explore SAM/BAM and GTF formats. Run FastQC to assess data quality, then move straight into alignment: build a STAR genome index, map your samples to the reference genome, and run featureCounts to generate the count matrix that drives all downstream analysis.
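To make the Phred quality scores mentioned above concrete, here is a minimal sketch in Python (the workshop itself works in Linux and R; Python is used here only for illustration). It assumes the common Phred+33 encoding, and the function names and the quality string are hypothetical examples, not workshop material.

```python
def phred_scores(quality_line: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores.

    Assumes Phred+33 encoding (offset 33); older data may use a different offset.
    """
    return [ord(ch) - offset for ch in quality_line]


def error_probability(q: int) -> float:
    """Convert a Phred score Q into a base-call error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


# Hypothetical quality line for a 7-base read.
quality = "IIII5+!"
scores = phred_scores(quality)
print(scores)
print([error_probability(q) for q in scores])
```

A score of 20 corresponds to roughly a 1-in-100 chance that the base call is wrong, and a score of 40 to roughly 1 in 10,000.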
The R you actually need, nothing more
Transition from Linux to R. Learn RStudio, R data structures, and Bioconductor package management, only what’s needed for RNA-seq. Import and normalize your count matrix, and understand why raw counts can’t be compared directly.
Statistical analysis: the session you’ve been working toward
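One way to see why raw counts can't be compared directly is a toy example: if one sample is sequenced twice as deeply as another, every gene's raw count doubles even when expression is unchanged. The sketch below, in Python for illustration only, uses counts-per-million (CPM) scaling as one simple normalization; the gene names and counts are made up, and the workshop's R workflow may use a different method.

```python
def counts_per_million(counts: dict[str, int]) -> dict[str, float]:
    """Scale raw read counts by library size so samples become comparable."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("sample has no reads")
    return {gene: c * 1_000_000 / library_size for gene, c in counts.items()}


# Hypothetical samples: B has exactly twice the sequencing depth of A.
sample_a = {"geneX": 100, "geneY": 900}
sample_b = {"geneX": 200, "geneY": 1800}

print(counts_per_million(sample_a))
print(counts_per_million(sample_b))
```

The raw count of geneX doubles between the two samples, but after scaling both samples report the same value, which reflects that only sequencing depth changed.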
Run differential expression analysis with limma, perform PCA and sample QC, interpret logFC and adjusted p-values, and extract your DEG list. Compare DESeq2, edgeR, and limma-voom so you can choose the right tool for your data.
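For readers curious what an adjusted p-value is, the following is a minimal Python sketch of the Benjamini-Hochberg procedure, a widely used multiple-testing adjustment. It is an illustration under stated assumptions, not the workshop's code: the function name and the p-values are hypothetical, and in practice the differential expression packages named above report adjusted p-values for you.

```python
def bh_adjust(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
print(bh_adjust(raw))
```

Each adjusted value is at least as large as the raw p-value, which is why fewer genes pass a 0.05 cutoff after adjustment than before.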
Publication-ready figures and biological meaning
Create volcano plots, heatmaps, PCA plots, and MA plots using ggplot2. Then translate DEGs into biology: GO enrichment, KEGG pathways, and GSEA. Build the figures and narrative that go directly into your paper.
Batch effects and complex experimental designs
Real data is messy. Learn to detect and visualize batch effects using PCA, adjust your statistical model to account for known covariates, apply ComBat for batch correction when appropriate, and design multi-factor experiments properly, including paired samples and blocking factors.
Cell-type deconvolution and NCBI GEO submission
Understand the limits of bulk RNA-seq and use reference-based deconvolution to estimate cell-type composition changes between conditions. Then close the loop on publication: prepare metadata, navigate the NCBI GEO submission portal step by step, and leave with a reproducible analysis checklist you can apply to every future project.
Everything you need,
nothing you don’t
One enrollment covers the full workshop experience: tools, support, and materials.
7 live sessions with Dr. Guo: ~2 hours each, hands-on from minute one
Pre-configured cloud environment: Log in and start coding, no local setup
Lifetime access to all recordings: Re-watch any session whenever you need
Complete, annotated code repository: All scripts, ready to run on your own data
Real-world practice datasets: Representative of actual research projects
1 month of email support: Get help applying the skills to your data
All slides and teaching materials: Keep them for future reference
Certificate of completion: Documenting your training hours
Dr. Lei Guo
Computational Biologist · UT Southwestern Medical Center · Founder, NGS101.com
Dr. Guo is a computational biologist with over a decade of experience in genomic data analysis and a researcher-turned-educator who has made it his mission to demystify bioinformatics for the life science community.
As the founder of NGS101.com, Dr. Guo has built a library of 70+ in-depth tutorials covering RNA-seq, single-cell analysis, epigenetics, Hi-C, ATAC-seq, DNA methylation, and whole-genome/whole-exome sequencing (WGS/WES), helping thousands of researchers worldwide analyze their own NGS data every month.
His teaching philosophy is simple: no step is too small to explain. He teaches the way he wishes someone had taught him: with clarity, context, and zero assumption of prior knowledge.
Stop waiting for a bioinformatician.
Become one.
Cohort size is intentionally limited. Every participant gets real attention, real feedback, and a real learning experience.
Session time may be adjusted based on the time zones of enrolled participants.
Registration closes April 25, 2026
Group rates available for 3+ participants from the same institution. Questions? Contact Dr. Guo directly.