Containers with Apptainer (formerly Singularity) on the HPC Cluster¶
Modern Linux systems often use containers. A container is a normal Linux process that runs with its own isolated view of the filesystem, network, and processes, created by kernel namespaces and cgroups. Applications are packaged as images containing all necessary binaries, libraries, and configs; the runtime starts a process from the image with that isolation.
In bioinformatics, the common container formats are Docker and Apptainer. Docker relies on a privileged daemon and is generally unsuitable for multi-user HPC clusters. Apptainer, on the other hand, is HPC-oriented: it needs no root daemon, integrates with Slurm and shared filesystems, runs .sif images, and can pull from Docker/OCI registries.
This page shows Apptainer basics and how to use it with Slurm, using rMATS-turbo as an example.
All .sif containers are located in /share/apps/containers, so they are accessible on all nodes, for example:
/share/apps/containers/rmats-turbo-0.1.sif
TL;DR¶
All modules using Apptainer were modified to expose the contained programs directly, so there is no need to type apptainer exec ... anymore.
So instead of:

apptainer exec --cleanenv --bind "$BIND" /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h

you can now simply run:

rmats.py -h
You can still follow the apptainer path outlined below.
Quick start¶
Load Apptainer¶
Apptainer is no longer provided via modules; it is baked into the VNFS and available on every node.
Sanity checks¶
apptainer --version
apptainer exec /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h
If you see the rMATS help, the image is usable.
Binding data paths¶
By default Apptainer binds your $HOME and the current working directory. Any additional read/write locations must be bound explicitly so the container can see them:
# This is an example: bind project data, reference, and a fast scratch directory
BIND="/dbase/genomes:/ref,/scratch/$USER:/tmp"
apptainer exec --cleanenv --bind "$BIND" /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h
- The left side of : is the host path; the right side is the mount point inside the container.
- --cleanenv avoids accidental leakage of host modules/variables into the container.
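Multiple bind specs are joined with commas; when several are needed, it can be cleaner to build the BIND string from an array. A minimal sketch, with illustrative paths:

```shell
# Build a comma-separated --bind string from individual host:container pairs.
# These paths are examples only; substitute your own data locations.
binds=(
  "/dbase/genomes:/ref"    # reference data exposed as /ref inside the container
  "/scratch/$USER:/tmp"    # fast scratch mapped to the container's /tmp
)
BIND=$(IFS=,; printf '%s' "${binds[*]}")
echo "$BIND"
```

The result is then passed as apptainer exec --cleanenv --bind "$BIND" ... as shown above.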
rMATS-turbo must be run on the worker nodes through a Slurm batch job¶
Create text files listing the BAM/FASTQ files for each group (per the rMATS-turbo documentation, each file holds its group's paths as a single comma-separated line), for example:
b1.txt # Group 1
b2.txt # Group 2
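The group files can be generated from directories of BAMs; per the rMATS-turbo documentation, each file holds one comma-separated line of paths. A sketch, where group1/ and group2/ are placeholder directory names:

```shell
# Collect each group's BAM paths into a single comma-separated line.
# group1/ and group2/ are placeholders; adjust to your own layout.
ls group1/*.bam | paste -s -d, - > b1.txt
ls group2/*.bam | paste -s -d, - > b2.txt
cat b1.txt
```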
rmats_job.sh
#!/usr/bin/env bash
#SBATCH -J rmats
#SBATCH -p general # pick a suitable CPU partition
#SBATCH -c 16 # threads for rMATS (--nthread)
#SBATCH --mem=64G # adjust to your dataset
#SBATCH -t 24:00:00
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err
set -euo pipefail
module use /share/apps/Modules/modulefiles
# Apptainer is baked into the VNFS; no module load is needed.
# Paths (edit as needed)
IMG=/share/apps/containers/rmats-turbo-0.1.sif
GTF=/ref/annotation.gtf # inside-container path (see BIND below)
B1=$PWD/b1.txt # host path; we will bind PWD
B2=$PWD/b2.txt
OUT=$PWD/rmats_out
TMP=/scratch/$USER/rmats_${SLURM_JOB_ID}
mkdir -p "$OUT" "$TMP" /scratch/$USER
# Bind host paths to container mount points:
BIND="$PWD:$PWD,/share:/share,/dbase:/dbase,/scratch/$USER:/scratch,/dbase/genomes:/ref" # example: host /dbase/genomes becomes /ref in the container
# rMATS run (CPU; --nthread matches SLURM -c)
srun apptainer exec --cleanenv --bind "$BIND" "$IMG" rmats.py \
    --b1 "$B1" --b2 "$B2" \
    --gtf "$GTF" \
    -t paired \
    --libType fr-firststrand \
    --readLength 150 \
    --nthread ${SLURM_CPUS_PER_TASK:-16} \
    --od "$OUT" \
    --tmp "$TMP"
echo "Done. Results in: $OUT"
Submit:
sbatch rmats_job.sh
Notes
• Use the correct --libType (fr-firststrand, fr-secondstrand, or fr-unstranded) and --readLength for your data.
• If your GTF is on /dbase/genomes or another location, reflect that in BIND and GTF.
• For large cohorts, increase --mem and wall time as needed. rMATS writes many intermediate files; keep --tmp on fast storage.
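To make sure --readLength matches your data, you can read it off the first record of a FASTQ. A hypothetical check, where sample_R1.fastq.gz is a placeholder file name:

```shell
# Print the length of the first read in a gzipped FASTQ (line 2 is the sequence).
# sample_R1.fastq.gz is a placeholder; point this at one of your input files.
zcat sample_R1.fastq.gz | awk 'NR==2 { print length($0); exit }'
```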
Interactive test (before submitting)¶
You can test your run interactively for a limited time by using srun.
Once everything works, use the sbatch script for the final analysis.
srun --pty -p interactive -c 4 --mem=8G -t 01:00:00 bash
apptainer exec --cleanenv --bind "$PWD:$PWD" /share/apps/containers/rmats-turbo-0.1.sif rmats.py -h
exit
Tips & troubleshooting¶
- Command not found inside the image: run an interactive check:
  apptainer exec /share/apps/containers/rmats-turbo-0.1.sif bash -lc 'which rmats.py && rmats.py -h'
- Files not visible in the container: add the parent directory to --bind, or run from the directory that holds the inputs.
- Module conflicts / environment noise: always add --cleanenv.
- Threading: match --nthread to #SBATCH -c (and avoid oversubscription).
- Temporary space: point --tmp to a fast local or shared scratch with sufficient quota.
- No conda needed: the container carries its own dependencies; avoid installing an old Python for rMATS on the host.
See also¶
Note: conda and bioconda, often used in bioinformatics, are not recommended on HPC systems and therefore are not available.