Containerization on Summit

When installing software, you may come across applications that have complex chains of dependencies that are challenging to compile and install. Some software may require very specific versions of libraries that may not be available on Summit or conflict with libraries needed for other applications. You may also need to move between several workstations or HPC platforms, which often requires reinstalling your software on each system. Containers are a good way to tackle all of these issues and more.

Containerization Fundamentals

Containers build upon an idea that has long existed within computing: hardware can be emulated through software. Virtualization simulates some or all components of computation through a software application. Virtual machines use this concept to run an entire operating system as an application on a host system. Containers follow the same idea at a much smaller scale: rather than emulating a full machine, they run on top of the host system’s kernel.

Containers are portable compartmentalizations of some or all of the following: an operating system, software, libraries, data, and workflows. Containers offer:

  • Portability: containers can run on any system equipped with its specified container manager.
  • Reproducibility: because containers are instances of prebuilt, isolated software, the software executes the same way every time.

Containers distinguish themselves through their low computational overhead and their ability to utilize all of a host system’s resources. Building containers is a relatively simple process that starts with a container engine.

Docker

Docker is by far the most popular container engine, and can be used on any system where you have administrative privileges. Because of this need for administrative privileges, Docker containers cannot be built or run directly on Research Computing resources. To use a Docker container on Research Computing resources, build a Singularity image using a Docker image as a base.

See the documentation on Singularity (below) if you wish to run a Docker container on RMACC Summit or Blanca.

Singularity

Singularity is a containerization software package that does not require users to have administrative privileges when running containers, and can thus be safely used on Research Computing resources such as RMACC Summit and Blanca. Singularity is preinstalled on Research Computing resources, so all that is needed to run Singularity containers is to load the Singularity module on a compute node on RMACC Summit or Blanca:

module load singularity/3.0.2
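
Before running any of the commands below, it can help to confirm that the module actually put the singularity command on your PATH. The following is a minimal sketch; the function name require_singularity is hypothetical and not part of Singularity itself.

```shell
# Sketch: check that the singularity command is available in this shell.
# require_singularity is a hypothetical helper name used for illustration.
require_singularity() {
    if command -v singularity >/dev/null 2>&1; then
        echo "singularity found at: $(command -v singularity)"
    else
        # Point the user back at the module load step shown above.
        echo "singularity not found; try: module load singularity/3.0.2" >&2
        return 1
    fi
}

# Usage (on a compute node):
#   require_singularity && singularity --version
```

The function only inspects your PATH, so it is safe to call from job scripts before any container work starts.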

Much like Docker, Singularity is containerization software designed around compartmentalization of applications, libraries, and workflows. This is done through the creation of Singularity images, which can be run as ephemeral Singularity containers. Unlike Docker, however, Singularity does not manage images, containers, or volumes through a central application. Instead, Singularity generates saved image files that can be either mutable or immutable, depending on compression.

Singularity Hub

Singularity Hub is a container registry that allows users to pull images from a server onto a system with Singularity installed. Singularity Hub uses GitHub to host image recipes, builds images in the cloud from these recipes, and places the resulting images in the Singularity Hub registry.

Note: You do not need a GitHub account if you only wish to pull Singularity images.

https://singularity-hub.org/

Singularity Hub has a variety of useful prebuilt images for different software packages and workflows so be sure to check if the software you need is already available.

Note: As of 2019, there are two Singularity container registries. The first is Singularity Hub, described above, which is managed by Stanford University and Lawrence Berkeley National Laboratory. The second is the Sylabs Singularity Container Library, which was created in late 2018 when Singularity was spun off into the private company Sylabs. Below we provide documentation on how to pull images from either registry, and on how to build images on Singularity Hub via GitHub and in the Sylabs Singularity Container Library using their “Remote Builder” functionality.

Pulling Singularity Images

Because we cannot build our own Singularity images on HPC systems, we must instead bring our images over from another location. Pulling images from public repositories is often the easiest solution to this problem.

We can use the singularity pull command to remotely download our chosen image file. The command requires the container registry we would like to use, followed by the repository’s name:

singularity pull <container-registry>://<repository-name>

A container registry is simply a server that manages uploaded containers. Some examples of these container registries include Docker Hub, Singularity Hub, and the Singularity Container Library.

Pull from Docker Hub:

singularity pull docker://another:example

Pull from Singularity Hub:

singularity pull shub://example:repo

Pull from Singularity Container Library (Singularity version 3.0 and greater):

singularity pull library://example:repo

Lastly, we can rename the Singularity image file pulled from a repository by using the -n/--name flag:

singularity pull -n ExampleContainer.sif shub://example:tag

Example:

Pulling the Docker image of the latest tag of ubuntu can be done with the following command:

singularity pull docker://ubuntu:latest
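
To make the <container-registry>://<repository-name> pattern concrete, here is a small sketch that composes the pull URI and rejects registry prefixes other than the three described above. The helper name pull_uri is hypothetical, and the script only prints the command so you can review it before running it.

```shell
# Sketch: compose a pull URI of the form <container-registry>://<repository-name>.
# pull_uri is a hypothetical helper; it accepts only the three registry
# prefixes discussed in this section (docker, shub, library).
pull_uri() {
    case "$1" in
        docker|shub|library)
            printf '%s://%s\n' "$1" "$2"
            ;;
        *)
            echo "unknown registry: $1" >&2
            return 1
            ;;
    esac
}

# Print the full command for review instead of executing it.
echo "singularity pull $(pull_uri docker ubuntu:latest)"
```

Printing the command first is a deliberate choice here: a mistyped registry prefix is caught before any download starts.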

Running a Singularity image as a container

Singularity images can be run as containers much like Docker images. The Singularity command to use, however, depends on what you’d like to do. After pulling your image from a registry such as Docker Hub or Singularity Hub, you can run the image with the singularity run command. Type:

singularity run <image-name>

Running a Singularity container executes the container’s default program, which is specified in the container definition file. To execute a specific program in your container, use the singularity exec command and specify the program:

singularity exec <image-name> <program>

Much like specifying an application in Docker, this allows you to execute any program that is installed within your container. Unlike Docker, however, you do not need to specify a shell application to shell into the container. Simply use the singularity shell command:

singularity shell <image-name>

Example:

Say we have a Singularity image that contains Python 3.7 as the default software, and we want to run Python from the container. We can do this with the command:

singularity run python-cont.img

If the default application for the image is not python we could run python as follows:

singularity exec python-cont.img python
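
The choice between run and exec can be summarized in a short sketch: with no program given, the container’s default program is used (run); otherwise the named program is executed (exec). The helper name container_cmd is hypothetical, and it only prints the command it would use.

```shell
# Sketch: pick "singularity run" or "singularity exec" based on the arguments.
# container_cmd IMAGE [PROGRAM [ARGS...]]   (hypothetical helper name)
# The body runs in a subshell so the image variable does not leak out.
container_cmd() (
    image=$1
    shift
    if [ "$#" -eq 0 ]; then
        # No program given: rely on the image's default program.
        echo "singularity run $image"
    else
        # Program given: execute it inside the container.
        echo "singularity exec $image $*"
    fi
)

container_cmd python-cont.img
container_cmd python-cont.img python --version
```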

File Access

By default most user-owned files and directories are available to any container that is run on RMACC Summit and Blanca (this includes files in /home/$USER, /projects/$USER, /scratch/summit/$USER and /rc_scratch/$USER). This means that normally a user will not need to bind any folders to the container’s directory tree. Furthermore, a container will also have access to the files in the same folder where it was initialized.

Sometimes, however, certain folders that are not bound by default may be necessary to run your application. To bind any additional folders or files to your Singularity container, you can utilize the -B flag in your singularity run, exec, and shell commands. To bind an additional folder to your Singularity container, type:

singularity run -B /source/directory:/target/directory sample-image.img

Additionally, you can bind directories by using the SINGULARITY_BINDPATH environment variable. Simply export a comma-separated list of the directory pairs you would like to bind to your container:

export SINGULARITY_BINDPATH=/source/directory1:/target/directory1,\
/source/directory2:/target/directory2

Then run, execute, or shell into the container as normal.
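
When there are many directory pairs, building the comma-separated value by hand is error-prone. The following sketch joins any number of source:target pairs with commas; the helper name bind_path is hypothetical, and the directories are the placeholder paths used above.

```shell
# Sketch: join source:target pairs with commas for SINGULARITY_BINDPATH.
# bind_path SRC:DST [SRC:DST ...]   (hypothetical helper name)
# The body runs in a subshell so the IFS change stays local to the function.
bind_path() (
    IFS=,
    # "$*" joins all arguments using the first character of IFS (a comma).
    printf '%s\n' "$*"
)

SINGULARITY_BINDPATH=$(bind_path \
    /source/directory1:/target/directory1 \
    /source/directory2:/target/directory2)
export SINGULARITY_BINDPATH
echo "$SINGULARITY_BINDPATH"
```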

Building a Singularity image

Important: You cannot build Singularity images directly on Summit. If you cannot build an image on your local machine you will need to build it on Singularity Hub or Sylabs Remote Builder.

Singularity Build

Just like Docker, Singularity allows a user to build images using a definition file. The file is saved with the name “Singularity” and contains instructions on how to prepare a Singularity image file. Just like a Dockerfile, this file has a variety of directives that allow for the customization of your image. A sample definition file would look something like this:

Bootstrap: docker
From: ubuntu

%help
	I am help text!

%post
	# %post runs inside the container at build time (%setup runs on the host)
	apt-get update
	apt-get install -y nano gcc

%runscript
	echo "hello! I am a container!"

Once you have written your Singularity recipe, you can build the application either remotely (see below) or locally with the singularity build command. To build a Singularity image locally, type:

sudo singularity build <img-name.img> <recipe-name.def>

Again, it is important to note that if you build an image locally as described above, you must build your image on a computer that you have administrative privileges on. If you do not have administrative privileges you will not be able to build the container in this manner. Fortunately, there are other ways to build containers remotely, which are discussed next.
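
Whichever build method you choose, a quick sanity check of the recipe can save a failed build. The sketch below only checks that the file exists and has Bootstrap: and From: header lines. This assumes a recipe that bootstraps from a registry image, as in the sample above; other bootstrap agents may use different headers. The helper name check_recipe is hypothetical.

```shell
# Sketch: minimal pre-build check of a Singularity definition file.
# check_recipe FILE   (hypothetical helper name)
# Assumption: the recipe bootstraps from a registry image, so it should
# contain both a "Bootstrap:" and a "From:" header line.
check_recipe() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    grep -qi '^[[:space:]]*bootstrap:' "$1" \
        || { echo "missing Bootstrap: header in $1" >&2; return 1; }
    grep -qi '^[[:space:]]*from:' "$1" \
        || { echo "missing From: header in $1" >&2; return 1; }
    echo "$1 has the expected headers"
}

# Usage:
#   check_recipe Singularity && sudo singularity build <img-name.img> Singularity
```

This does not validate the sections of the recipe; it only catches the most basic omissions.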

Building Images Remotely with Singularity Hub

To build images with Singularity Hub, you must first create a GitHub account at https://github.com/join if you do not have one already. After completing this step, log into your GitHub account and create an empty repository.

After creating your repository, upload a Singularity definition file named “Singularity” to the repository. This is all we need to generate our Singularity image.

Now, log into Singularity Hub with your GitHub credentials, navigate to “My Container Collections,” and click the link “Add a Collection.” A list of the GitHub repositories you contribute to will be shown. Simply click the button on the repository you wish to add to Singularity Hub.

Your container should build automatically if you have a recipe file named “Singularity” within your repository. By default, Singularity Hub will attempt to build any time something is pushed to the GitHub repository. This can be changed in the settings tab on the container’s build page. If the build fails the first time, revise the Singularity recipe and push it again; the build will then start again.

More on building containers: https://www.sylabs.io/guides/3.0/user-guide/build_a_container.html

Building Images Remotely with the Singularity Remote Builder

With Singularity 3.0, users have the ability to build containers remotely through Sylabs remote builder. Unlike Singularity Hub though, the Singularity remote builder can be utilized directly on the command line from RMACC Summit or Blanca without needing to upload to a repository.

To begin using Singularity Remote Builder, navigate to your home directory and run the commands:

mkdir .singularity
cd .singularity 

Now on your local machine, navigate to: https://cloud.sylabs.io/auth

and log into Sylabs with your Google, GitHub, GitLab, or Microsoft account. Once you have logged into Sylabs, provide a label for your token under the field “Create A New Access Token” and click “Create a new Token.” This will generate a long string that will be read by Singularity on RMACC Summit or Blanca.

Now on RMACC Summit or Blanca run the command:

echo "<your-token>" > sylabs-token
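
Because the token grants access to your Sylabs account, you may want to keep the file readable only by you. The following sketch combines the directory creation and token steps above and restricts the file permissions. The permission settings are an added precaution, not a documented Singularity requirement, and the helper name save_token is hypothetical.

```shell
# Sketch: store the Sylabs token in ~/.singularity/sylabs-token with
# owner-only permissions.
# save_token TOKEN   (hypothetical helper name)
# The body runs in a subshell so the umask change stays local.
save_token() (
    umask 077                          # newly created files: owner-only
    mkdir -p "$HOME/.singularity"      # no error if the directory exists
    printf '%s\n' "$1" > "$HOME/.singularity/sylabs-token"
    chmod 600 "$HOME/.singularity/sylabs-token"
)

# Usage:
#   save_token "<your-token>"
```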

After this you can now build containers through the Sylabs remote builder on RMACC Summit or Blanca. Simply load Singularity 3.0.2 into your module stack and run the command:

singularity build --remote <desired-image-name> <your-recipe>

Building MPI-enabled Singularity images

MPI-enabled Singularity containers can be deployed on RMACC Summit with the caveat that the MPI software within the container stays consistent with MPI software available on the system. This requirement diminishes the portability of MPI-enabled containers, as they may not run on other systems without compatible MPI software. Regardless, MPI-enabled containers can still be a very useful option in many cases.

Here we provide an example of using a gcc compiler with OpenMPI. RMACC Summit uses an Omni-Path interconnect (a low latency network fabric that enables MPI to be efficiently implemented across nodes). In order to use a Singularity container with OpenMPI (or any MPI) on Summit, there are two requirements:

  • The Singularity container needs to have the Omni-Path libraries installed inside.
  • OpenMPI needs to be installed both inside and outside of the Singularity container. More specifically, the same version of OpenMPI needs to be installed inside and outside. The versions should at least be very similar; you can sometimes get away with two different minor versions (for example, 2.1 and 2.0).
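
The version requirement can be expressed as a simple comparison. The sketch below classifies two OpenMPI version strings as identical, sharing a major version (which, per the note above, sometimes works), or mismatched. The helper name mpi_version_check is hypothetical, and the version strings are supplied by hand rather than detected.

```shell
# Sketch: compare the host and container OpenMPI version strings.
# mpi_version_check HOST_VERSION CONTAINER_VERSION   (hypothetical helper name)
mpi_version_check() {
    if [ "$1" = "$2" ]; then
        echo "match"
    elif [ "${1%%.*}" = "${2%%.*}" ]; then
        # Same major version only: this sometimes works, but is not guaranteed.
        echo "same major version, may work"
    else
        echo "mismatch"
        return 1
    fi
}

mpi_version_check 2.0.1 2.0.1
mpi_version_check 2.0.1 2.1.0
```

In practice the two version strings could be read from the output of mpirun --version on the host and inside the container.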

The following Singularity recipe ensures that OpenMPI 2.0.1 is installed in the image, which matches the openmpi/2.0.1 module that is available on RMACC Summit. This recipe can be used as a template to build your own MPI-enabled container images for RMACC Summit and can be found at: https://github.com/ResearchComputing/core-software/tree/master/singularity

Once you’ve built the container with one of the methods outlined above, you can place it on RMACC Summit and run it on a compute node. The following is an example of running a gcc/OpenMPI container with Singularity on RMACC Summit. The syntax is a normal MPI run where multiple instances of a Singularity image are run. The following example runs mpi_hello_world with MPI from a container.

ml gcc/6.1.0
ml openmpi/2.0.1
ml singularity/3.0.2

mpirun -np 4 singularity exec openmpi.sif mpi_hello_world

Note that it is also possible to build Intel/IMPI containers for use on RMACC Summit, which are likely to have enhanced performance on Summit’s Intel architecture compared to gcc/OpenMPI containers. If you would like assistance building MPI-enabled containers, contact rc-help@colorado.edu.