In this tutorial, you will learn how to run a Nextflow pipeline that uses containerised applications.
Containers make scientific software highly portable and reproducible. Fortunately, Nextflow integrates smoothly with popular container engines (e.g., Docker and Singularity), which provide a lightweight virtualisation layer for running software applications. Please note that on Puhti you can only work with Singularity containers, as Docker containers require privileged access, which CSC users do not have on Puhti.
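Not being able to run Docker is usually not a limitation in practice: Singularity can pull an image straight from a Docker registry and convert it into a local .sif image file. Below is a minimal sketch of this conversion, assuming the nf-core Sarek image on Docker Hub (nfcore/sarek:3.1.1, matching the pipeline version used later); note that when you run a pipeline with -profile singularity, Nextflow performs this pull automatically, so the manual step is mainly useful for debugging:

```shell
# Keep Singularity's (potentially large) build cache out of $HOME
export SINGULARITY_CACHEDIR=$PWD
export SINGULARITY_TMPDIR=$PWD

# Pull a Docker image and convert it to a Singularity image file (.sif)
singularity pull sarek_3.1.1.sif docker://nfcore/sarek:3.1.1

# Run a command inside the resulting container
singularity exec sarek_3.1.1.sif echo "container works"
```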
Here we use the Sarek workflow from the nf-core community.
In this tutorial, we will use the Puhti supercomputer. First, log in to Puhti using SSH (or by opening a login node shell in the Puhti web interface):
ssh <username>@puhti.csc.fi # replace <username> with your CSC username, e.g. myname@puhti.csc.fi
Then go to your scratch directory, from which you will submit the Nextflow job:
mkdir -p /scratch/<project>/$USER/ # replace <project> with your CSC project, e.g. project_2001234
cd /scratch/<project>/$USER/ && mkdir -p nf-core && cd nf-core
Here is an example batch script to run the pipeline on Puhti:
#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --partition=small
#SBATCH --account=project_xxxx
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=4000
export SINGULARITY_TMPDIR=$PWD
export SINGULARITY_CACHEDIR=$PWD
unset XDG_RUNTIME_DIR
# Activate Nextflow on Puhti
module load nextflow/22.10.1
# nf-core pipeline examples here
# Variant calling on genome data
nextflow run nf-core/sarek -r 3.1.1 -profile test,singularity -resume
# proteomics example
# nextflow run nf-core/proteomicslfq -r 1.0.0 -profile test,singularity -resume
# metabolomics example
# nextflow run nf-core/metaboigniter -r 1.0.1 -profile test,singularity -resume
Copy and paste the above script into a file named sarek_nfcore.sh and replace project_xxxx in the Slurm directives with your own project number.
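You can also patch the account line in place with sed. In this sketch, project_2001234 is an illustrative project number (use your own), and for the sake of a self-contained example the script is stood in by a single #SBATCH line rather than the full batch script above:

```shell
# Stand-in for the full batch script saved as sarek_nfcore.sh
printf '#SBATCH --account=project_xxxx\n' > sarek_nfcore.sh

# Replace the placeholder with your CSC project number (project_2001234 is illustrative)
sed -i 's/project_xxxx/project_2001234/' sarek_nfcore.sh

# Verify the change
grep -- '--account' sarek_nfcore.sh
# → #SBATCH --account=project_2001234
```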
Finally, submit your job:
sbatch sarek_nfcore.sh
You can check the status of your pipeline with the following command:
squeue -u $USER
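Once the job is running, Nextflow's console output goes to the Slurm output file in the submission directory, and Nextflow also keeps its own log there. A sketch of common monitoring commands (replace <jobid> with the job ID reported by sbatch or shown by squeue; seff is Slurm's standard post-run efficiency report, available on Puhti):

```shell
# Follow the pipeline's console output as it runs
tail -f slurm-<jobid>.out

# Inspect Nextflow's own log in the directory you submitted from
tail .nextflow.log

# After the job finishes, check CPU and memory usage
seff <jobid>
```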
Consider the following aspects of the pipeline: