3 Running the nf-core/rnaseq pipeline v3.23.0
We will not process the original dataset because doing so would require more computing power. Instead, we will process only two size-reduced samples, for which 50,000 reads were randomly sampled from the original data (sub-sampling script: util/subsample_50k_PRJNA1158806.sh).
We will use pipeline release v3.23.0. As of this release, Bowtie2 can be used for read alignment, which is advantageous for processing prokaryotic RNA-seq data. For simplicity, we specify a “prokaryotic” profile, which automatically uses Bowtie2 for read alignment and Salmon for read quantification.
To process the data, run the following command:
nextflow run 'https://github.com/nf-core/rnaseq' \
-name 'Ecoli_MG1655_saccharin_2_samples' \
--outdir '/workspaces/dsp_transcriptomics_training/results/nfcore_rnaseq_processing_subsampled' \
--input '/workspaces/dsp_transcriptomics_training/data/seq_files_subsampled/samplesheet_50k_subsampled_2samples.csv' \
--fasta '/workspaces/dsp_transcriptomics_training/data/genome_files/GCF_000005845.2_ASM584v2_genomic.fna.gz' \
--gtf '/workspaces/dsp_transcriptomics_training/data/genome_files/GCF_000005845.2_ASM584v2_genomic.gtf.gz' \
-r 3.23.0 \
-profile prokaryotic,docker \
-c /workspaces/dsp_transcriptomics_training/01_scripts/custom.config
Parameter descriptions:
-name: name of the processing run
--outdir: (absolute) path to the output directory where results will be saved (the sub-directory is created automatically)
--input: (absolute) path to the sample sheet
--fasta: (absolute) path to the gzipped genome FASTA file
--gtf: (absolute) path to the gzipped genome annotation file
-r: nf-core/rnaseq pipeline release/version
-profile: profile(s) to run; here “prokaryotic” mode using “docker”
-c: (absolute) path to the custom configuration file; used here to limit the number of CPUs and memory
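The contents of the custom configuration file passed with -c are not shown here. As a rough illustration of how such a file can cap CPUs and memory, the sketch below writes a small example config. The file name custom.example.config and the limit values (2 CPUs, 6 GB, 1 hour) are assumptions made for this example, not the course's actual settings, and the resourceLimits directive assumes Nextflow 24.04 or newer.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name, chosen so the course's real custom.config is not overwritten.
CONFIG_FILE="${CONFIG_FILE:-custom.example.config}"

# Write an example Nextflow config that caps the resources any single process may request.
# The values below are illustrative assumptions, not the course's actual limits.
cat > "$CONFIG_FILE" <<'EOF'
process {
    resourceLimits = [
        cpus: 2,
        memory: 6.GB,
        time: 1.h
    ]
}
EOF

echo "Wrote example config to $CONFIG_FILE"
```

A file like this would be passed to the run with -c, in the same way as the custom.config in the command above. Requests from individual pipeline processes that exceed these limits are reduced to the limit instead of causing the run to fail.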
The Nextflow command is stored in a bash script and can be executed by running:
The processing time is about 7 minutes.
📌 Remember: Rather than configuring parameters manually, nf-core offers automatic parameter configuration. Go to the pipeline page and press Launch version 3.23.0 to see all pipeline parameters and change them as needed. A configuration file can be generated automatically for use with your Nextflow command.
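A generated parameter file can replace the pipeline parameters (the double-dash options) on the command line. The sketch below writes such a file by hand, using the paths from the command in this section, and prints the matching Nextflow command without running it. The file name nf-params.example.json is an assumption for illustration; a file produced by the launch page may contain additional parameters.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name for this example.
PARAMS_FILE="${PARAMS_FILE:-nf-params.example.json}"

# Pipeline parameters (--input, --outdir, --fasta, --gtf) expressed as JSON keys.
cat > "$PARAMS_FILE" <<'EOF'
{
    "input": "/workspaces/dsp_transcriptomics_training/data/seq_files_subsampled/samplesheet_50k_subsampled_2samples.csv",
    "outdir": "/workspaces/dsp_transcriptomics_training/results/nfcore_rnaseq_processing_subsampled",
    "fasta": "/workspaces/dsp_transcriptomics_training/data/genome_files/GCF_000005845.2_ASM584v2_genomic.fna.gz",
    "gtf": "/workspaces/dsp_transcriptomics_training/data/genome_files/GCF_000005845.2_ASM584v2_genomic.gtf.gz"
}
EOF

# Print (do not execute) the command that would use the parameter file.
# Nextflow options (single dash) such as -r, -profile and -c stay on the command line.
echo "nextflow run 'https://github.com/nf-core/rnaseq' -r 3.23.0 -profile prokaryotic,docker -params-file $PARAMS_FILE"
```

Keeping the parameters in a file makes a run easier to repeat and to share, because the command line then only carries the Nextflow options.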