4.1 Differential Expression Analysis

⚠️ Warning: Always use raw integer counts as input. Do not use TPM, FPKM, or any other normalised values, because DESeq2 handles normalisation internally.
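A quick sanity check can catch normalised input before it reaches the model. The sketch below is only an illustration: `is_raw_counts()` is a hypothetical helper (not part of DESeq2), and both toy matrices contain made-up values.

```r
# Hypothetical helper: TRUE only for a numeric matrix of non-negative whole numbers.
is_raw_counts <- function(m) {
  is.matrix(m) && is.numeric(m) && !anyNA(m) && all(m >= 0) && all(m == round(m))
}

# Toy matrices with made-up values (2 genes x 2 samples).
raw_counts <- matrix(c(12, 0, 7, 230), nrow = 2)
tpm_values <- matrix(c(12.4, 0, 7.9, 230.1), nrow = 2)

is_raw_counts(raw_counts)  # TRUE
is_raw_counts(tpm_values)  # FALSE
```

For the `dds` object used in this section, the equivalent check would be `is_raw_counts(counts(dds))`, using DESeq2's `counts()` accessor.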

DESeq2 fits a negative binomial model to the raw counts and tests each gene with a Wald test. A single call to DESeq() runs three steps in sequence: size factor estimation (normalisation), dispersion estimation, and the Wald test. Each step can also be run separately.
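To make the first step concrete, here is a minimal base-R sketch of the median-of-ratios method that DESeq2 uses by default for size factors. It is an illustration under stated assumptions, not DESeq2's own code: the count matrix is a made-up toy example in which each sample is an exact multiple of the first.

```r
# Toy count matrix (hypothetical values): 4 genes x 3 samples.
counts <- matrix(
  c( 10,  20,  40,
    100, 200, 400,
      5,  10,  20,
     50, 100, 200),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Log of the geometric mean of each gene across samples.
log_geo_means <- rowMeans(log(counts))

# Size factor per sample: median ratio of its counts to the gene-wise
# geometric mean, skipping genes with a zero count in any sample.
size_factors <- apply(counts, 2, function(sample_counts) {
  keep <- is.finite(log_geo_means) & sample_counts > 0
  exp(median((log(sample_counts) - log_geo_means)[keep]))
})

# Divide each sample (column) by its size factor.
normalised <- sweep(counts, 2, size_factors, "/")

size_factors
normalised
```

In DESeq2 itself, the three steps correspond to `estimateSizeFactors()`, `estimateDispersions()`, and `nbinomWaldTest()`, each taking and returning the `dds` object.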

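# Locate the repository root so the path works from any working directory
git_root <- system("git rev-parse --show-toplevel", intern = TRUE)

# Load the saved DESeqDataSet object
dds <- readRDS(file.path(git_root, "results", "rds", "dds_ecoli_MG1655.rds"))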

cat("✅ DDS loaded\n")
## ✅ DDS loaded
cat("   Dimensions  :", nrow(dds), "genes ×", ncol(dds), "samples\n")
##    Dimensions  : 3698 genes × 6 samples
cat("   Conditions  :", paste(levels(dds$condition), collapse = " vs "), "\n")
##    Conditions  : control vs treatment

📌 Remember: The variable of interest should be at the end of the design formula, and the control group must be the first (reference) factor level. Both were set in the QC script.
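As a reminder of what that setup looks like, here is a small base-R sketch. The labels, the `batch` covariate, and the formula are hypothetical and only illustrate the two rules; they are not taken from the QC script.

```r
# Hypothetical condition labels where the control sorts last alphabetically.
condition <- factor(c("treated", "untreated", "treated", "untreated"))
levels(condition)  # "treated" "untreated": treated would be the reference

# Make the control group the reference (first) level.
condition <- relevel(condition, ref = "untreated")
levels(condition)  # "untreated" "treated"

# Hypothetical design with a batch covariate; the variable of interest goes last.
design_formula <- ~ batch + condition
tail(all.vars(design_formula), 1)  # "condition"
```

R orders factor levels alphabetically by default, so an explicit `relevel()` avoids depending on how the group names happen to sort.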

Run the analysis and check the results

dds <- DESeq(dds)
resultsNames(dds)
## [1] "Intercept"                      "condition_treatment_vs_control"
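The coefficient name encodes the comparison as variable, level, then reference level, which is why the reference level set earlier matters. The base-R sketch below only mimics that naming pattern and does not call DESeq2; the factor is a made-up stand-in for `dds$condition`.

```r
# Stand-in for dds$condition, with "control" as the reference (first) level.
condition <- factor(
  c("control", "control", "control", "treatment", "treatment", "treatment"),
  levels = c("control", "treatment")
)

reference <- levels(condition)[1]
other_levels <- levels(condition)[-1]

# Build names following the <variable>_<level>_vs_<reference> pattern.
coef_names <- c("Intercept", paste0("condition_", other_levels, "_vs_", reference))
coef_names  # "Intercept" "condition_treatment_vs_control"
```

A positive log2 fold change for this coefficient therefore means higher expression in treatment relative to control.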