Singularity

Overview

Containers allow users to bundle their application in a lightweight package along with everything required to run it, including code, software dependencies, system libraries, and configuration files. Docker is a popular container framework, but it is not suitable for HPC because it requires elevated privileges and a separate daemon process. Singularity is a container framework designed specifically for HPC platforms: it runs containers in user space, with no possibility of privilege escalation inside the container. Singularity can also convert existing Docker containers to the Singularity image format, or run containers directly from public registries such as Docker Hub.

When running a compute job that does not use containers, you are running in what we’ll refer to as the host environment. This includes the operating system, libraries, devices, and network for the compute node. When running a compute job within a container, you are instead running in the container’s environment. However, some aspects of the host environment are reflected into every container. This includes:

  • All mount points to HPC storage (/home, /data, /opt/apps, /scratch)
  • /dev (including Infiniband and GPU devices)
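
For example, once you have pulled an image (the ubuntu_latest.sif file used later in this guide), you can confirm from inside the container that host storage and devices are visible:

$ singularity exec ubuntu_latest.sif ls /scratch
$ singularity exec ubuntu_latest.sif ls /dev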

Quick Start

Request a compute node using idev. The singularity pull command can be used to download pre-built images from registries such as Docker Hub or Singularity Hub.
Set umask u=rwx,g=rx,o=rx before pulling an image to allow “other” read access.

$ idev -N 1 --ntasks=1 -t 90
$ module load singularity/3.0.0 
$ umask u=rwx,g=rx,o=rx
$ singularity pull shub://vsoch/hello-world
$ singularity pull docker://ubuntu:latest 

Run the container’s runscript with the run command, or execute the image file directly.

$ singularity run hello-world_latest.sif
$ ./hello-world_latest.sif

We can also execute a specific command within a container using the exec command.

$ singularity exec hello-world_latest.sif cat /etc/os-release

To run an interactive shell in a container, use the shell command.

$ singularity shell hello-world_latest.sif

To run GPU-enabled containers, add the --nv option to the exec or run commands.

$ idev -N 1 --ntasks=1 --gres=gpu:1 -t 90
$ git clone https://github.com/tensorflow/models.git
$ singularity exec --nv docker://tensorflow/tensorflow:latest-gpu python ./models/tutorials/image/mnist/convolutional.py

Running containers with MPI support using Singularity is more involved; see the MPI support section below for more discussion.
Singularity commands can also be run from a batch script; see the batch mode section below for a sample job script.

Singularity Basics

This section contains all the information necessary to get started with Singularity.

Commonly used commands

Below are the most commonly used Singularity commands. See the Singularity documentation for all available commands.

  help      Help about any command
  build     Build a new Singularity container
  exec      Execute a command within a container
  inspect   Display metadata for a container if available
  pull      Pull a container from a URI
  run       Launch a runscript within a container
  run-help  Display help for a container if available
  shell     Run a Bourne shell within a container
  test      Run defined tests for this particular container

Examples:

$ singularity help
$ singularity help <command>
$ singularity shell docker://ubuntu:latest
$ singularity exec docker://ubuntu:latest cat /etc/os-release
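
You can also check a container's metadata and its embedded help text; the example below assumes the hello-world image from the Quick Start has already been pulled:

$ singularity inspect hello-world_latest.sif
$ singularity run-help hello-world_latest.sif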

Obtaining container images

The pull and build commands can be used to download pre-built images from registries such as Docker Hub or the Singularity Container Library. For Singularity v3.0 the default image format is the Singularity Image File (SIF). build requires specifying a name for the container, whereas the name is optional when downloading with pull. Note that all the examples in this guide are run on a compute node within an interactive job.
To start with, let's look around the host environment for a moment. We can see that the compute nodes (as of this writing) run CentOS 7 on a 3.10.0-693 kernel.

$ uname -r
3.10.0-693.11.6.el7.x86_64

$ cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

We’ll also create a scratch workspace to work within, but you can use any storage you have access to.

$ mkworkspace 
Successfully created workspace.  Details:
    Workspace: /path/to/workspace
    User: user.name
    Group: its_p_sys_ur_kam-mygroup
    Expiration: 2018-12-28 14:55:28.38494

$ cd /path/to/workspace

Now we can load the singularity module and download pre-built images using either pull or build.
NOTE: Set umask u=rwx,g=rx,o=rx before pulling or building an image to allow “other” read access; otherwise you will get the error “No valid /bin/sh in container”.

$ module load singularity/3.0.0
$ umask u=rwx,g=rx,o=rx
$ singularity pull docker://ubuntu:latest # with default ubuntu_latest.sif name
$ singularity pull ubuntu.sif docker://ubuntu:latest # with custom ubuntu.sif name
$ singularity build ubuntu.sif library://ubuntu
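
To confirm that the image was written to the current directory and that the umask gave it the expected “other” read access, a quick check might look like this:

$ ls -l ubuntu_latest.sif
$ singularity inspect ubuntu_latest.sif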

Running containers

There are two ways to run Singularity containers: in interactive mode, where we run a container interactively on a compute node, and in batch mode, where we run a container as part of a batch script.

Interactive mode
$ module load singularity/3.0.0
$ singularity pull docker://ubuntu:latest
WARNING: Authentication token file not found : Only pulls of public images will succeed
INFO:    Starting build...
Getting image source signatures
Copying blob sha256:32802c0cfa4defde2981bec336096350d0bb490469c494e21f678b1dcf6d831f
 30.62 MiB / 30.62 MiB [====================================================] 0s
...
Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
INFO:    Build complete: ubuntu_latest.sif 

$ singularity shell ubuntu_latest.sif 
Singularity ubuntu_latest.sif:/path/to/workspace>

Notice the change in the command prompt, indicating we are now running within the container. Let’s compare it to the host environment and verify we’re really in an Ubuntu container.

Singularity ubuntu_latest.sif:/path/to/workspace> uname -r
3.10.0-693.11.6.el7.x86_64

Singularity ubuntu_latest.sif:/path/to/workspace> cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.1 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Exit the shell with exit to return to the host environment. Notice that while in the container we appear to be running Ubuntu (because we are), we are still running on the host’s kernel, since a container is not a virtual machine.

You can also directly interact with image URIs. This creates an ephemeral container that disappears when the shell is exited.

$ singularity shell docker://ubuntu:latest
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Skipping fetch of repeat blob sha256:32802c0cfa4defde2981bec336096350d0bb490469c494e21f678b1dcf6d831f
...
Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
INFO:    Build complete: /home/user.name/.singularity/cache/oci-tmp/6d0e0c26489e33f5a6f0020edface2727db9489744ecc9b4f50c7fa671f23c49/ubuntu_latest.sif
INFO:    Image cached as SIF at /home/user.name/.singularity/cache/oci-tmp/6d0e0c26489e33f5a6f0020edface2727db9489744ecc9b4f50c7fa671f23c49/ubuntu_latest.sif

Singularity ubuntu_latest.sif:/path/to/workspace> cat /etc/os-release 
NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.1 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
Singularity ubuntu_latest.sif:/path/to/workspace> exit
exit
$

We can also execute a specific command within a container using the exec command:

$ singularity exec ubuntu_latest.sif cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.1 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

Batch mode

Let’s see how we can use the downloaded Ubuntu image in a compute job. Create a file with the following submission script and submit it with sbatch.

#!/bin/bash
#SBATCH -n 1        # Number of cores
#SBATCH -t 0-00:05  # Runtime in D-HH:MM
#SBATCH --job-name=singularity

echo "Starting singularity on host $HOSTNAME"

image=/path/to/workspace/ubuntu_latest.sif

module load singularity/3.0.0

singularity exec $image ls $HOME

echo "Completed singularity on host $HOSTNAME"
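
Assuming the script above is saved as singularity_job.sh (the file name here is arbitrary), it can be submitted and its output inspected once the job completes:

$ sbatch singularity_job.sh
$ cat slurm-<jobid>.out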

The job’s output file should contain a listing of the contents of your home directory.

Binding Directories

Singularity allows you to access directories on the host system from within the container using bind mounts. On the HPC cluster, several directories such as /home, /data, /scratch, and /tmp are available inside each container by default. You can also bind other directories into your container using the --bind/-B command-line option or the $SINGULARITY_BINDPATH environment variable. Mount options may be specified as ro (read-only) or rw (read/write, the default).

The following example shows how to bind /opt on the host to /opt in the container in read-only mode, and /data on the host to /mnt in the container, using the --bind command-line option:

$ singularity shell --bind /opt:/opt:ro,/data:/mnt container.sif

or using the environment variable:

$ export SINGULARITY_BINDPATH="/opt:/opt:ro,/data:/mnt"
$ singularity shell container.sif
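
To verify a bind mount from inside the container, you can list the target path; the example below reuses the container.sif image and paths from above:

$ singularity exec --bind /data:/mnt container.sif ls /mnt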

Containers with MPI support

Singularity was developed to run containers on HPC platforms and supports different MPI implementations (such as Intel MPI, MVAPICH2, and Open MPI). It was designed for out-of-the-box compatibility with Open MPI (v2.1.x), but is known to work well with other MPI implementations too. There are a few points, however, to keep in mind before running a container with MPI:

  • The MPI implementations on the host and in the container need not be identical, but they must be ABI (binary) compatible. The MPI version inside the container should also be the same as, or newer than, the version on the host.
  • Bind mount the host's InfiniBand libraries into the container.

The example below shows how to obtain a Docker image and run the container using MPI by binding the host MPI and InfiniBand libraries.

$ module load singularity/3.0.0
$ singularity pull docker://underworldcode/underworld2:dev
$ export SINGULARITYENV_LD_LIBRARY_PATH=/opt/apps/mpich/gcc/6.1.0/3.2/lib:$LD_LIBRARY_PATH
$ export SINGULARITY_BINDPATH="/opt/apps/mpich/gcc/6.1.0/3.2:/opt/apps/mpich/gcc/6.1.0/3.2:ro,/opt/mellanox:/opt/mellanox:ro,/opt/ibutils:/opt/ibutils:ro"
$ srun --mpi=pmi2 -N 2 -n 4 singularity exec underworld2_dev.sif python path/to/script
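
The same MPI run can also be wrapped in a batch script. The sketch below reuses the module version and bind paths from the commands above; adjust them to match the MPI stack your container expects:

#!/bin/bash
#SBATCH -N 2
#SBATCH -n 4
#SBATCH -t 0-00:30
#SBATCH --job-name=singularity-mpi

module load singularity/3.0.0

# Point the container at the host MPI and InfiniBand libraries
export SINGULARITYENV_LD_LIBRARY_PATH=/opt/apps/mpich/gcc/6.1.0/3.2/lib:$LD_LIBRARY_PATH
export SINGULARITY_BINDPATH="/opt/apps/mpich/gcc/6.1.0/3.2:/opt/apps/mpich/gcc/6.1.0/3.2:ro,/opt/mellanox:/opt/mellanox:ro,/opt/ibutils:/opt/ibutils:ro"

srun --mpi=pmi2 singularity exec underworld2_dev.sif python path/to/script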

Containers with GPU support

Singularity allows you to run GPU-enabled containers by simply adding the --nv option to the exec or run commands. The following example shows how to run a GPU-enabled container interactively on the HPC cluster.

$ idev -N 1 --ntasks=1 --gres=gpu:1 -t 90
...
salloc: Granted job allocation 5911427
Allocated nodes: sn12
[user.name@sn12 /path/to/workspace]$ module load singularity/3.0.0 
[user.name@sn12 /path/to/workspace]$ git clone https://github.com/tensorflow/models.git
Cloning into 'models'...
remote: Enumerating objects: 95, done.
remote: Counting objects: 100% (95/95), done.
remote: Compressing objects: 100% (67/67), done.
remote: Total 23695 (delta 52), reused 55 (delta 28), pack-reused 23600
Receiving objects: 100% (23695/23695), 563.07 MiB | 43.51 MiB/s, done.
Resolving deltas: 100% (13941/13941), done.
Checking out files: 100% (2798/2798), done.

$ singularity exec --nv docker://tensorflow/tensorflow:latest-gpu python ./models/tutorials/image/mnist/convolutional.py
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Copying blob sha256:18d680d616571900d78ee1c8fff0310f2a2afe39c6ed0ba2651ff667af406c3e
 41.34 MiB / 41.34 MiB [====================================================] 0s
...
Writing manifest to image destination
Storing signatures
INFO:    Creating SIF file...
INFO:    Build complete: /home/user.name/.singularity/cache/oci-tmp/847690afb29977920dbdbcf64a8669a2aaa0a202844fe80ea5cb524ede9f0a0b/tensorflow_latest-gpu.sif
INFO:    Image cached as SIF at /home/user.name/.singularity/cache/oci-tmp/847690afb29977920dbdbcf64a8669a2aaa0a202844fe80ea5cb524ede9f0a0b/tensorflow_latest-gpu.sif
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
2018-12-20 17:09:04.128519: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-20 17:09:04.540442: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:04:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-12-20 17:09:04.790309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 1 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:05:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-12-20 17:09:05.059855: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 2 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:84:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-12-20 17:09:05.320347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 3 with properties: 
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:85:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2018-12-20 17:09:05.321053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2018-12-20 17:09:13.020318: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-20 17:09:13.020372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988]      0 1 2 3 
2018-12-20 17:09:13.020382: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0:   N Y N N 
2018-12-20 17:09:13.020390: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1:   Y N N N 
2018-12-20 17:09:13.020396: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2:   N N N Y 
2018-12-20 17:09:13.020403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3:   N N Y N 
2018-12-20 17:09:13.028882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10756 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:04:00.0, compute capability: 3.7)
2018-12-20 17:09:13.029360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 10756 MB memory) -> physical GPU (device: 1, name: Tesla K80, pci bus id: 0000:05:00.0, compute capability: 3.7)
2018-12-20 17:09:13.029730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 10756 MB memory) -> physical GPU (device: 2, name: Tesla K80, pci bus id: 0000:84:00.0, compute capability: 3.7)
2018-12-20 17:09:13.030092: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 10756 MB memory) -> physical GPU (device: 3, name: Tesla K80, pci bus id: 0000:85:00.0, compute capability: 3.7)
Initialized!
Step 0 (epoch 0.00), 162.1 ms
Minibatch loss: 8.334, learning rate: 0.010000
Minibatch error: 85.9%
Validation error: 84.6%
Step 100 (epoch 0.12), 9.6 ms
Minibatch loss: 3.243, learning rate: 0.010000
Minibatch error: 6.2%
Validation error: 7.5%
...
Step 8500 (epoch 9.89), 8.1 ms
Minibatch loss: 1.599, learning rate: 0.006302
Minibatch error: 0.0%
Validation error: 0.9%
Test error: 0.8%
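
The same GPU workload can be run in batch mode. The sketch below follows the batch example from earlier and assumes the models repository has already been cloned into the submission directory:

#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1     # request one GPU
#SBATCH -t 0-01:30
#SBATCH --job-name=singularity-gpu

module load singularity/3.0.0

singularity exec --nv docker://tensorflow/tensorflow:latest-gpu python ./models/tutorials/image/mnist/convolutional.py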

Building Singularity images

At this point you will likely want to know how to install your application(s) into a container or otherwise modify it to suit your needs. For that, see Singularity’s documentation, linked below, which explains how to create your own container images or modify existing ones. Note that we do not support the use of sudo with Singularity. If you need sudo rights, you can build a container that suits your needs on your own workstation or in a virtual machine running Linux where you have sudo/root access. After you create the container image, you can copy it to the HPC cluster and run it.
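
As an illustration only (the file names and package choice here are hypothetical), a minimal definition file built on a machine where you have root access might look like this:

$ cat ubuntu-python.def
Bootstrap: docker
From: ubuntu:18.04

%post
    apt-get update && apt-get install -y python3

%runscript
    exec python3 "$@"

$ sudo singularity build ubuntu-python.sif ubuntu-python.def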

Further Reading

Singularity documentation