This tutorial is done on Puhti and its web interface, which requires that:
💬 In this tutorial, you will learn how to:
This tutorial uses a data science Jupyter notebook image maintained by the Jupyter project as an example. Ready-made Docker container images are available in registries such as Docker Hub and Red Hat Quay. In these registries, search with the keyword “data science” and select “jupyter/datascience-notebook” to see the available Jupyter notebook images.
While container images can be pulled and used directly as Singularity containers on HPC systems, CSC’s Tykky container wrapper provides an easier way to install them. This tutorial uses a data science Docker image from Docker Hub. You can install the Jupyter notebook environment under the /projappl directory as follows:
# Navigate to /projappl area of your course project
cd /projappl/project_200xxxx/ # Make sure to replace 200xxxx with the correct course project number
module purge # Clean your environment
module load tykky # Load the Tykky container wrapper
mkdir -p /projappl/project_200xxxx/$USER/Notebook # Creates the $USER directory and the Notebook directory inside it
# Use the wrap-container command from the tykky module to install the image's executables under /projappl
wrap-container -w /opt/conda/bin docker://docker.io/jupyter/datascience-notebook:x86_64-ubuntu-22.04 --prefix /projappl/project_200xxxx/$USER/Notebook
# The installation can take a while
# The -w option specifies the directory inside the container where the software is installed. For this data science container image, the path is /opt/conda/bin
# The --prefix option specifies the directory on the host system where the software is installed
Upon successful installation, the Jupyter notebook executables will be available in the directory /projappl/project_200xxxx/$USER/Notebook/bin.
You can download an example Python notebook for performing basic data analysis tasks in the installed Jupyter notebook as follows:
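Before moving on, it can be useful to confirm that the expected executables actually exist under the installation prefix. The following is a minimal sketch, not part of Tykky: `check_wrapped_env` is a hypothetical helper name, and the assumption that the image exposes `python` and `jupyter` under `bin` should be checked against your own installation.

```shell
# Hypothetical helper (not a Tykky command): check that the named
# executables exist and are executable under <prefix>/bin.
check_wrapped_env() {
    prefix="$1"
    shift
    missing=0
    for exe in "$@"; do
        if [ -x "$prefix/bin/$exe" ]; then
            echo "found: $prefix/bin/$exe"
        else
            echo "missing: $prefix/bin/$exe"
            missing=1
        fi
    done
    # Return 0 only if every requested executable was found
    return "$missing"
}

# Example call (assumed executable names; replace 200xxxx with your project number):
# check_wrapped_env "/projappl/project_200xxxx/$USER/Notebook" python jupyter
```

A non-zero return status indicates that at least one of the listed executables is missing, which usually means the installation did not finish or a different prefix was used.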
cd /scratch/project_200xxxx/$USER
wget https://a3s.fi/biocontainers2024/course_notebook.tar.gz
tar -xavf course_notebook.tar.gz
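If you would like to see what the archive contains before (or after) unpacking it, the members can be listed without extracting anything. This is a small sketch under the assumption that the file is a gzip-compressed tar archive, as its name suggests; `list_archive` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: list the members of a gzip-compressed tar archive
# without extracting them; fails if the file does not exist.
list_archive() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    tar -tzf "$archive"
}

# Example call, run in the directory where the archive was downloaded:
# list_archive course_notebook.tar.gz
```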
Once you have logged in to the Puhti web interface, select the “Jupyter” icon from the pinned apps on the landing page. Then use the following settings to launch the Jupyter notebook:
Reservation: use course reservation if available.
Project: project_2003682
Partition: small
Number of CPU cores: 2
Memory (Gb): 2
Local disk: 0
Time: 0:45:00
Python: custom path
Custom Python interpreter: /projappl/project_2003682/$USER/Notebook/bin/python (please replace $USER with your CSC username)
Working directory: /scratch/project_2003682/
Finally, click “Launch” to start the notebook.
‼️ Please note that the course reservation (name: container_course) field on the Puhti web interface will only be available on the course day(s), and only to members of the course project.
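Since `$USER` has to be replaced by hand in the “Custom Python interpreter” field, one option is to print the full path in a Puhti terminal and copy it into the form. The sketch below assumes the installation prefix used earlier in this tutorial; `interpreter_path` is a hypothetical helper name, not an existing command.

```shell
# Hypothetical helper: print the custom Python interpreter path for a
# given project and username (defaults to the current $USER).
interpreter_path() {
    project="$1"
    user="${2:-$USER}"
    echo "/projappl/$project/$user/Notebook/bin/python"
}

# Example call for the course project:
# interpreter_path project_2003682
```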