# Building a custom nf-core analysis pipeline
These descriptions are based on the custom pipeline dsp_demo_nf_acore_vuegen, which is used to highlight how an analysis notebook based on acore can be integrated with a report based on vuegen.
It builds on the nf-core tools and their template for a Nextflow repository, see
nf-core/tools.
## Using the template
The instructions on their website are brief; follow the command line instructions there. By default, the generated pipeline targets genomics data and demonstrates a FastQC analysis step.

- The pipeline name can be customized to use a prefix other than `nf-core-`, e.g. `dsp-`. See the customization options of the `nf-core pipelines create` command: set `--organization dsp` to get the `dsp-` prefix. The pipeline structure is explained here.
- An input schema (e.g. for an SDRF file) can be defined using the schema-tutorial. The default pipeline has a single input, a CSV sample sheet, and an output directory.
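As a sketch, creating the pipeline from the template could look like the following; the name, description, and author are hypothetical placeholders, and depending on your nf-core tools version the organization flag may be spelled `--organisation`:

```shell
# Create a new pipeline from the nf-core template with a custom "dsp-" prefix
# (name, description, and author below are illustrative assumptions)
nf-core pipelines create \
    --name demo \
    --description "Demo analysis pipeline" \
    --author "Your Name" \
    --organization dsp
```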
## Make adaptations to the pipeline created from the template
Edit the schema and remove the parameters that are not needed using `nf-core pipelines schema build`, then try to get the pipeline to run.
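A minimal sketch of this step, assuming the default template's `test` profile and Docker as the container engine:

```shell
# Interactively prune unused parameters from nextflow_schema.json
nf-core pipelines schema build

# Then try to get the pipeline to run, e.g. on the bundled test configuration
nextflow run . -profile test,docker --outdir results
```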
## Add modules and subworkflows
Using the `nf-core modules` or `nf-core subworkflows` commands, you can add modules
or entire subworkflows to your pipeline, enabling you to augment pre-existing pipelines
with new functionality before or after the existing workflow.
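For instance, a sketch of adding a module and a subworkflow from the default nf-core repositories; the module and subworkflow names below are illustrative examples, not part of this pipeline:

```shell
# Browse what is available in the default nf-core repositories
nf-core modules list remote
nf-core subworkflows list remote

# Install a single module into the pipeline (fastqc as an example)
nf-core modules install fastqc

# Install an entire subworkflow together with the modules it uses
nf-core subworkflows install bam_sort_stats_samtools
```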
## Deviate from existing modules (patch)
Patching allows you to create a modified version of an existing module, which can be
useful if you want to make small changes without creating a new module from scratch.
You can use the `nf-core modules patch` command to create a patch for an existing
module, which lets you modify the module's code while keeping track of the original
version and still pulling in upstream changes. In short, you can:

- make custom adjustments
- still incorporate updates from the original module (or subworkflow)
For example, for the thermorawfileparser module in bigbio/nf-modules, you can
install and patch it, and later update it with the latest upstream changes:

```shell
# Install the module from the custom remote
nf-core modules --git-remote https://github.com/bigbio/nf-modules.git install thermorawfileparser

# Apply a "patch" to the installed module
# This creates a local editable version while keeping a reference to the upstream source
# Any changes you make will be tracked as a patch (diff) on top of the original module
nf-core modules --git-remote https://github.com/bigbio/nf-modules.git patch thermorawfileparser

# A while later, after updates were made to bigbio/nf-modules/thermorawfileparser,
# you can pull the latest changes and update your patched version:
# Your local modifications (patch) will be re-applied on top of the updated module
# This helps to keep your custom changes while staying in sync with upstream improvements
nf-core modules --git-remote https://github.com/bigbio/nf-modules.git update thermorawfileparser
```
For more details, see the nf-core documentation on patching.
## Lint
Check for errors and warnings:

```shell
nf-core pipelines lint .
```
## Test
You will need to add basic tests for the pipeline.
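A minimal smoke test, assuming the template's bundled `test` profile is kept and Docker is available; the nf-test invocation assumes nf-test is set up, as in recent nf-core templates:

```shell
# Run the pipeline on the small built-in test configuration
nextflow run . -profile test,docker --outdir results

# If nf-test is set up, run the pipeline's test suite
nf-test test --profile docker
```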
## Wave
Wave can be used to auto-generate containers for workflow runs (e.g. if conda is not available), going from a conda environment to a containerized version.

- Use the `wave` profile from the nf-core template config (this will deactivate any pre-built Docker or Singularity containers).
- It also allows you to specify a custom container registry, e.g. `ghcr.io/biosustain` for privately hosted containers in Seqera Platform (which then executes on Azure), see here.
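Assuming the `wave` profile from the template config described above, enabling it on a run could look like:

```shell
# Build containers on the fly via Wave instead of using pre-built images
# (profile names are taken from the nf-core template config mentioned above)
nextflow run . -profile wave,test --outdir results
```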
## Define the report path
The file has to be linked explicitly in a process output, not just its folder: `reports` as an output path would not display a report located at `reports/myreport.html`, but `reports/myreport*` would.
A custom report, e.g. from VueGen, can be added to the reports tab in Seqera Cloud using the tower.yml configuration file:
```yaml
# tower.yml
reports:
  multiqc_report.html:
    display: "MultiQC HTML report"
  quarto_report.html:
    display: "VueGen HTML report"
```
## Hints
The example was moved from a course repository to its own repository, dsp_demo_nf_acore_vuegen, in order to make it executable on Seqera Cloud. The initial history can be found here.
## Commit without formatting errors
Use the `pre-commit` hooks for formatting on all files:

```shell
# Install the pre-commit tool
pip install pre-commit

# Install the Git hooks defined in .pre-commit-config.yaml
# This sets up automatic checks that run before every commit
pre-commit install

# After installing the hooks, only this is needed to check all files:
# Useful for initial cleanup or when introducing pre-commit to an existing repo
# Also helpful to manually re-run checks without making a commit
pre-commit run --all-files
```