
Containers with Apptainer (formerly Singularity) on the HPC Cluster

Modern Linux systems often use containers. A container is a normal Linux process that runs with its own isolated view of the filesystem, network, and processes, created by kernel namespaces and cgroups. Applications are packaged as images containing all necessary binaries, libraries, and configs; the runtime starts a process from the image with that isolation.
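A quick way to see this isolation in practice: the same command reports the host's OS when run normally and the image's own OS when run through the container runtime (the image path below reuses the rMATS example from this page; the exact output depends on the image).

cat /etc/os-release                                                              # OS seen by a normal host process
apptainer exec /share/apps/containers/rmats-turbo-0.1.sif cat /etc/os-release   # OS seen inside the container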

In bioinformatics, the common formats are Docker and Apptainer. Docker relies on a privileged daemon and is generally unsuitable for multi-user HPC clusters. Apptainer, on the other hand, is HPC-oriented: it needs no root daemon, integrates with Slurm and shared filesystems, runs .sif images, and can pull from Docker/OCI registries.
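For completeness, pulling a Docker/OCI image into a local .sif file is a single command; the image below (a stock Ubuntu image from Docker Hub) is purely illustrative, and on this cluster you will normally use the shared images under /share/apps/containers instead.

# Example only: convert a Docker Hub image into an Apptainer .sif in the current directory
apptainer pull ubuntu-22.04.sif docker://ubuntu:22.04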

This page shows Apptainer basics and how to use it with Slurm, using rMATS-turbo as an example.

All .sif containers are located in /share/apps/containers, so they are accessible on all nodes, for example:

/share/apps/containers/rmats-turbo-0.1.sif

TL;DR

All modules that use Apptainer have been modified to expose the contained programs directly, so there is no need to type apptainer exec ... anymore. Instead of:

apptainer exec --cleanenv --bind "$BIND" /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h
Use the program directly:
rmats.py -h
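In practice this means loading the module for the tool first; the module name below is hypothetical, so check module avail for the exact name on the cluster.

module avail             # list available modules
module load rmats-turbo  # hypothetical module name for the rMATS image
rmats.py -h              # the module wrapper runs apptainer exec for you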

You can still use the explicit apptainer exec approach outlined below.

Quick start

Apptainer availability

Apptainer is no longer provided via modules; it is baked into the VNFS (the node image), so the apptainer command is available on every node.

Sanity checks

apptainer --version
apptainer exec /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h

If you see the rMATS help, the image is usable.


Binding data paths

By default Apptainer binds your $HOME and the current working directory. Any additional read/write locations must be bound explicitly so the container can see them:

# This is an example: bind project data, reference, and a fast scratch directory
BIND="/dbase/genomes:/ref,/scratch/$USER:/tmp"
apptainer exec --cleanenv --bind "$BIND" /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h
  • Left side of : is the host path; right side is the mount point inside the container.
  • --cleanenv avoids accidental leakage of host modules/vars into the container.
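A quick sanity check that a bind works is to list the mount point from inside the container; with the example BIND above, /ref should show the contents of the host's /dbase/genomes (assuming that directory exists on your system).

apptainer exec --cleanenv --bind "$BIND" /share/apps/containers/rmats-turbo-0.1.sif ls /ref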

rMATS-turbo must be submitted through Slurm as a batch job so that it runs on the compute nodes

Create a text file for each group listing its BAM (or FASTQ) files on a single comma-separated line, for example:

b1.txt  # Group 1
b2.txt  # Group 2
Typical rMATS inputs are coordinate-sorted BAMs from a splice-aware aligner (e.g., STAR).
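For illustration, with two replicates per group and placeholder paths, b1.txt would contain a single comma-separated line such as:

# b1.txt (placeholder paths): all group-1 BAMs on one line, comma separated, no spaces
/path/to/group1_rep1.bam,/path/to/group1_rep2.bam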

rmats_job.sh

#!/usr/bin/env bash
#SBATCH -J rmats
#SBATCH -p general            # pick a suitable CPU partition
#SBATCH -c 16                 # threads for rMATS (--nthread)
#SBATCH --mem=64G             # adjust to your dataset
#SBATCH -t 24:00:00
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err

set -euo pipefail

# apptainer is part of the node image (VNFS), so no module needs to be loaded for it

# Paths (edit as needed)
IMG=/share/apps/containers/rmats-turbo-0.1.sif
GTF=/ref/annotation.gtf                 # inside-container path (see BIND below)
B1=$PWD/b1.txt                          # host path; we will bind PWD
B2=$PWD/b2.txt
OUT=$PWD/rmats_out
TMP=/scratch/$USER/rmats_${SLURM_JOB_ID}

mkdir -p "$OUT" "$TMP" /scratch/$USER

# Bind host paths to container mount points (left side: host path, right side: mount point in the container)
BIND="$PWD:$PWD,/share:/share,/dbase:/dbase,/scratch/$USER:/scratch/$USER,/dbase/genomes:/ref"   # example: host /dbase/genomes becomes /ref in container

# rMATS run (CPU; --nthread matches SLURM -c)
srun apptainer exec --cleanenv --bind "$BIND" "$IMG" rmats.py \
    --b1 "$B1" \
    --b2 "$B2" \
    --gtf "$GTF" \
    -t paired \
    --libType fr-firststrand \
    --readLength 150 \
    --nthread ${SLURM_CPUS_PER_TASK:-16} \
    --od "$OUT" \
    --tmp "$TMP"

echo "Done. Results in: $OUT"

Submit:

sbatch rmats_job.sh
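After submitting, you can watch the queue and follow the log files defined by the #SBATCH -o/-e lines (replace <jobid> with the ID printed by sbatch):

squeue -u $USER             # check that the job is pending or running
tail -f rmats.<jobid>.out   # %x.%j.out expands to <job name>.<job id>.out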

Notes
• Use the correct --libType (fr-firststrand, fr-secondstrand, or fr-unstranded) and --readLength for your data (one way to check the read length is shown after these notes).
• If your GTF is on /dbase/genomes or another location, reflect that in BIND and GTF.
• For large cohorts, increase --mem and wall time as needed. rMATS writes many intermediate files; keep --tmp on fast storage.
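If you are unsure of the read length, one way to check it (assuming samtools is available on the cluster, e.g. via a module) is to measure the sequence length of the first read in one of your BAMs:

samtools view /path/to/sample.bam | head -n 1 | awk '{print length($10)}'   # column 10 of SAM is the read sequence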


Interactive test (before submitting)

You can test your run interactively for a limited time using srun. Once everything works, submit the sbatch script for the final analysis.

srun --pty -p interactive -c 4 --mem=8G -t 01:00:00 bash
module load apptainer
apptainer exec --cleanenv --bind "$PWD:$PWD" /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h
exit

Tips & troubleshooting

  • Command not found inside image: check the program from inside the container (or open an interactive shell, see the sketch after this list):
    apptainer exec /share/apps/containers/rmats-turbo-0.1.sif bash -lc 'which rmats.py && rmats.py -h'
    
  • Files not visible in container: add the parent directory to --bind or run from the directory that holds the inputs.
  • Module conflicts/environment noise: always add --cleanenv.
  • Threading: match --nthread to #SBATCH -c (and avoid oversubscription).
  • Temporary space: point --tmp to a fast local or shared scratch with sufficient quota.
  • No conda needed: the container carries its own dependencies; avoid installing old Python for rMATS on the host.
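Beyond the one-off check mentioned in the list above, you can open an interactive shell inside the image to explore its contents (type exit to return to the host shell):

apptainer shell /share/apps/containers/rmats-turbo-0.1.sif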

Note: conda and bioconda, often used in bioinformatics, are not recommended on HPC systems and therefore will not be available.