Platform#
Seqera Platform (cloud) lets you run and monitor Nextflow pipelines. Parameters can be set in a web interface, and pipelines can be launched from there.
See docs.seqera.io/.
Pipeline parameters in Seqera#
How can directories be set by a browse button? nf-schema sets this up and saves it in an autogenerated nextflow_schema.json, which is located in the root directory of a pipeline repository.
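To make the role of the schema more concrete, the following sketch scans a schema for parameters declared with the format `directory-path`, which is the hint used in nf-schema style schemas for directory parameters. The trimmed example schema, its parameter names and the helper `directory_params` are hypothetical, and it is an assumption that the browse button in the web interface is driven by exactly this field.

```python
import json

# Hypothetical, heavily trimmed nextflow_schema.json. A real, autogenerated
# schema contains many more fields (descriptions, help texts, defaults, ...).
EXAMPLE_SCHEMA = json.loads("""
{
  "$defs": {
    "input_output_options": {
      "title": "Input/output options",
      "type": "object",
      "properties": {
        "input":  {"type": "string", "format": "file-path"},
        "outdir": {"type": "string", "format": "directory-path"}
      }
    }
  }
}
""")


def directory_params(schema):
    """Return the sorted names of parameters whose format is 'directory-path'."""
    # Newer schemas group parameters under "$defs", older ones under "definitions".
    groups = {**schema.get("definitions", {}), **schema.get("$defs", {})}
    found = set()
    for group in groups.values():
        for name, spec in group.get("properties", {}).items():
            if spec.get("format") == "directory-path":
                found.add(name)
    # Parameters may also be declared at the top level of the schema.
    for name, spec in schema.get("properties", {}).items():
        if spec.get("format") == "directory-path":
            found.add(name)
    return sorted(found)


print(directory_params(EXAMPLE_SCHEMA))
```

To check a real pipeline, load its schema with `json.load(open("nextflow_schema.json"))` instead of the inline example.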
Launching pipelines#
Make sure to use a Batch account with priority VMs for launching a pipeline, and cheaper ephemeral (non-priority) VMs for executing the individual jobs of the pipeline.
Each pool only has one VM size and type.
For more complicated setups with different VM sizes and types, multiple pools need to be specified (e.g. assigned to processes via process labels using withLabel).
However, as more than one job can run on a big node (which is a single VM), requesting modest resources for each process is a good way to distribute jobs economically across a few nodes. This is very similar to Computerome2, if you are familiar with that HPC.
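The packing argument can be illustrated with a little arithmetic. The sketch below estimates how many identical jobs fit on one node at the same time. The node size (16 CPUs, 64 GB) and the job requests are made-up numbers, `tasks_per_node` is a hypothetical helper, and the real scheduler may reserve resources or count slots differently.

```python
def tasks_per_node(node_cpus, node_mem_gb, task_cpus, task_mem_gb):
    """Number of identical tasks that fit on one node at the same time."""
    if task_cpus <= 0 or task_mem_gb <= 0:
        raise ValueError("task resources must be positive")
    # Whichever resource runs out first limits the packing.
    return min(node_cpus // task_cpus, int(node_mem_gb // task_mem_gb))


# Hypothetical node: 16 CPUs and 64 GB of memory.
NODE_CPUS, NODE_MEM_GB = 16, 64

# A modest request lets eight jobs share the node.
print(tasks_per_node(NODE_CPUS, NODE_MEM_GB, task_cpus=2, task_mem_gb=8))   # 8

# An inflated memory request cuts this to two jobs, leaving CPUs idle.
print(tasks_per_node(NODE_CPUS, NODE_MEM_GB, task_cpus=2, task_mem_gb=32))  # 2
```

In this example, asking for four times the memory means four times as many nodes for the same number of jobs.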
Overview of what happens when you launch a pipeline on Seqera:
Seqera Platform authenticates to Azure Batch and Storage as a service principal
It adds a task to Azure Batch which runs Nextflow
Nextflow starts on a compute node with an attached managed identity, which Nextflow uses to authenticate and to start adding jobs and tasks to Azure Batch
An Azure Storage container is used as an intermediate working directory for the pipeline
⚠️ It is unclear how the local storage of the node (VM) is handled if a job fails. Intermediate data might not be copied back to the storage container.
Each task on Azure Batch (including the Nextflow task) will pull a Docker container from a registry as well as remote storage which might be located
The node communicates back to Seqera Platform with the current status of the pipeline including completion
Seqera Platform will access the logs and results located on Azure Storage, using the service principal for authentication
Using Seqera from the command line - tower-cli#
Check out the tw CLI tool if you want to automate runs without using the web interface: github.com/seqeralabs/tower-cli
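As a rough sketch of such automation, the snippet below assembles a `tw launch` call from Python. The pipeline URL, workspace, compute environment and parameter file are placeholders, `tw_launch_command` is a hypothetical helper, and the snippet only prints the command rather than running it. It assumes `tw` is installed and that an access token is provided through the `TOWER_ACCESS_TOKEN` environment variable.

```python
import shlex


def tw_launch_command(pipeline, workspace, compute_env, params_file=None):
    """Build the argument list for a `tw launch` call without executing it."""
    cmd = [
        "tw", "launch", pipeline,
        f"--workspace={workspace}",
        f"--compute-env={compute_env}",
    ]
    if params_file is not None:
        # Pipeline parameters are read from a YAML or JSON file.
        cmd.append(f"--params-file={params_file}")
    return cmd


# All values below are placeholders for illustration only.
cmd = tw_launch_command(
    "https://github.com/my-org/my-pipeline",
    workspace="my-org/my-workspace",
    compute_env="azure-batch-env",
    params_file="params.yaml",
)
print(shlex.join(cmd))
```

To actually launch, the list could be passed to `subprocess.run(cmd, check=True)`. Building the command as a list avoids shell quoting issues.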