4.12 Summary

Before proceeding to differential expression analysis, confirm all QC checks pass:

| Check | Expected outcome | This dataset |
|---|---|---|
| Size factors ≈ 1.0 across samples | Library sizes are balanced | ⚠️ C3 = 2.324; corrected by normalisation |
| Boxplots of normalised counts overlap | No extreme outlier samples | ✅ All samples overlap after normalisation |
| Correlation heatmap: within-group distances < between-group | Replicates are reproducible | ✅ Clean block structure |
| PCA: PC1 separates conditions | Condition is the dominant source of variance | ✅ PC1 = 85.4%, perfect separation |
| No isolated samples in PCA or heatmap | No technical outliers | ⚠️ C3 offset on PC2; consistent, not alarming |

Overall, the dataset passes QC. C3 has a higher sequencing depth (size factor 2.324) and a mild transcriptional offset from C1 and C2 on PC2, and this pattern appears consistently across the QC plots. This is within the acceptable range for biological replicates and does not compromise the downstream analysis, so we proceed to differential expression.

⭐ Important: If any check fails, investigate the cause before running DESeq2. Proceeding with outlier samples or a confounded design will compromise all downstream results.
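The size-factor check from the table can also be run programmatically. The following is a minimal sketch in base R, assuming a named numeric vector of size factors such as the one returned by `sizeFactors(dds)`. The helper name `flag_size_factors` and the 0.5 to 2.0 bounds are illustrative assumptions, not thresholds taken from this analysis or from DESeq2.

```r
# Hypothetical helper: return the size factors that fall outside [lower, upper].
# The default bounds are illustrative assumptions; adjust them to your experiment.
flag_size_factors <- function(sf, lower = 0.5, upper = 2.0) {
  stopifnot(is.numeric(sf), !anyNA(sf), !is.null(names(sf)), lower < upper)
  sf[sf < lower | sf > upper]
}

# Example input: only the C3 value (2.324) is reported in this section;
# the C1 and C2 values are placeholders for illustration.
sf_example <- c(C1 = 1.0, C2 = 1.0, C3 = 2.324)
flagged <- flag_size_factors(sf_example)
print(names(flagged))
```

With the real object, the call would be `flag_size_factors(sizeFactors(dds))`. A flagged sample is a prompt to inspect the other QC plots, not an automatic reason to exclude it.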

# ── Export DDS for downstream analysis ────────────────────────────────────────
results_dir <- file.path(git_root, "results", "rds")
dir.create(results_dir, recursive = TRUE, showWarnings = FALSE)

dds_path <- file.path(results_dir, "dds_ecoli_MG1655.rds")
saveRDS(dds, file = dds_path)
cat("✅ DDS saved to:", dds_path, "\n")
cat("   Dimensions  :", nrow(dds), "genes ×", ncol(dds), "samples\n")
cat("   Conditions  :", paste(levels(dds$condition), collapse = " vs "), "\n")
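A later analysis step can reload the exported object with `readRDS()`. The sketch below wraps that call with two sanity checks: the file exists, and the object has the expected class. The helper name `load_checked_rds` is hypothetical, and a small data frame in a temporary file stands in for the real `dds` so that the example runs without DESeq2.

```r
# Hypothetical helper: read an .rds file and verify the class of the result.
load_checked_rds <- function(path, expected_class) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  obj <- readRDS(path)
  if (!methods::is(obj, expected_class)) {
    stop("Expected an object of class '", expected_class,
         "', got '", class(obj)[1], "'")
  }
  obj
}

# Round-trip demonstration with a stand-in object (not the real dds).
tmp_path <- tempfile(fileext = ".rds")
stand_in <- data.frame(sample = c("C1", "C2", "C3"))
saveRDS(stand_in, file = tmp_path)
reloaded <- load_checked_rds(tmp_path, "data.frame")
print(identical(reloaded, stand_in))
unlink(tmp_path)
```

For the file written above, the equivalent call would be `dds <- load_checked_rds(dds_path, "DESeqDataSet")`, with DESeq2 loaded so that the class definition is available.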


sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sonoma 14.3
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Copenhagen
## tzcode source: internal
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] heatmaply_1.6.0             viridis_0.6.5              
##  [3] viridisLite_0.4.3           KEGGREST_1.46.0            
##  [5] fgsea_1.32.4                mulea_1.1.1                
##  [7] plotly_4.12.0               DT_0.34.0                  
##  [9] kableExtra_1.4.0            knitr_1.51                 
## [11] factoextra_2.0.0            pheatmap_1.0.13            
## [13] RColorBrewer_1.1-3          ggpubr_0.6.3               
## [15] DESeq2_1.46.0               SummarizedExperiment_1.36.0
## [17] Biobase_2.66.0              MatrixGenerics_1.18.1      
## [19] matrixStats_1.5.0           GenomicRanges_1.58.0       
## [21] GenomeInfoDb_1.42.3         IRanges_2.40.1             
## [23] S4Vectors_0.44.0            BiocGenerics_0.52.0        
## [25] reshape2_1.4.5              lubridate_1.9.5            
## [27] forcats_1.0.1               stringr_1.6.0              
## [29] dplyr_1.2.1                 purrr_1.2.2                
## [31] readr_2.2.0                 tidyr_1.3.2                
## [33] tibble_3.3.1                ggplot2_4.0.3              
## [35] tidyverse_2.0.0            
## 
## loaded via a namespace (and not attached):
##  [1] gridExtra_2.3           rlang_1.2.0             magrittr_2.0.5         
##  [4] otel_0.2.0              compiler_4.4.1          png_0.1-9              
##  [7] systemfonts_1.3.2       vctrs_0.7.3             pkgconfig_2.0.3        
## [10] crayon_1.5.3            fastmap_1.2.0           backports_1.5.1        
## [13] XVector_0.46.0          labeling_0.4.3          ca_0.71.1              
## [16] rmarkdown_2.31          tzdb_0.5.0              UCSC.utils_1.2.0       
## [19] ragg_1.5.2              xfun_0.57               zlibbioc_1.52.0        
## [22] cachem_1.1.0            jsonlite_2.0.0          DelayedArray_0.32.0    
## [25] BiocParallel_1.40.2     broom_1.0.12            parallel_4.4.1         
## [28] R6_2.6.1                bslib_0.10.0            stringi_1.8.7          
## [31] car_3.1-5               numDeriv_2016.8-1.1     jquerylib_0.1.4        
## [34] iterators_1.0.14        assertthat_0.2.1        Rcpp_1.1.1-1.1         
## [37] bookdown_0.46           Matrix_1.7-5            timechange_0.4.0       
## [40] tidyselect_1.2.1        rstudioapi_0.18.0       abind_1.4-8            
## [43] yaml_2.3.12             TSP_1.2.7               codetools_0.2-20       
## [46] lattice_0.22-9          plyr_1.8.9              withr_3.0.2            
## [49] S7_0.2.2                coda_0.19-4.1           evaluate_1.0.5         
## [52] xml2_1.5.2              Biostrings_2.74.1       pillar_1.11.1          
## [55] carData_3.0-6           foreach_1.5.2           generics_0.1.4         
## [58] emdbook_1.3.14          hms_1.1.4               scales_1.4.0           
## [61] glue_1.8.1              lazyeval_0.2.3          tools_4.4.1            
## [64] apeglm_1.28.0           dendextend_1.19.1       data.table_1.18.2.1    
## [67] webshot_0.5.5           locfit_1.5-9.12         ggsignif_0.6.4         
## [70] registry_0.5-1          mvtnorm_1.3-6           fastmatch_1.1-8        
## [73] cowplot_1.2.0           grid_4.4.1              bbmle_1.0.25.1         
## [76] seriation_1.5.8         crosstalk_1.2.2         bdsmatrix_1.3-7        
## [79] colorspace_2.1-2        GenomeInfoDbData_1.2.13 Formula_1.2-5          
## [82] cli_3.6.6               textshaping_1.0.5       S4Arrays_1.6.0         
## [85] svglite_2.2.2           gtable_0.3.6            rstatix_0.7.3          
## [88] sass_0.4.10             digest_0.6.39           SparseArray_1.6.2      
## [91] ggrepel_0.9.8           htmlwidgets_1.6.4       farver_2.1.2           
## [94] htmltools_0.5.9         lifecycle_1.0.5         httr_1.4.8             
## [97] MASS_7.3-65