Slurm¶
SLURM originally stood for Simple Linux Utility for Resource Management; today it is generally referred to as the Slurm Workload Manager. In short, Slurm's goal is to allocate the cluster's resources to users' jobs efficiently, fairly, and reproducibly. From the user's point of view, it accepts jobs and job arrays and provides predictable start times, dependency handling, and resource guarantees.
Below are a few starter templates you can paste into your scripts. Adjust partition names, cores, memory, and walltimes to match your site's policies.
- Avoid running interactive jobs via srun except for short test runs!
- Use sbatch whenever possible!
Minimal CPU job: minimal_script.sh¶
#!/usr/bin/env bash
#SBATCH -p <partition>
#SBATCH -J myjob
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err
#SBATCH -c 8 # CPUs for this task
#SBATCH --mem=16G # total memory
#SBATCH -t 02:00:00 # walltime
module purge
# module use /share/apps/Modules/modulefiles
# module load your_software/1.0
# Match threads to the Slurm allocation (replace/add for your libs)
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
export MKL_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
my_program --in input.fq --out result.txt
sbatch minimal_script.sh
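The `${SLURM_CPUS_PER_TASK:-1}` expansion used above falls back to one thread whenever the variable is unset, e.g. when the script runs outside a Slurm job. A quick local sketch of that behaviour:

```shell
# Outside a Slurm job the variable is unset, so the fallback of 1 applies.
unset SLURM_CPUS_PER_TASK
echo "threads=${SLURM_CPUS_PER_TASK:-1}"   # threads=1

# Inside a job Slurm exports it; set it by hand here just to illustrate.
SLURM_CPUS_PER_TASK=8
echo "threads=${SLURM_CPUS_PER_TASK:-1}"   # threads=8
```

This keeps the same script usable both for local testing and for batch submission without edits.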
Array job (with concurrency cap)¶
#!/usr/bin/env bash
#SBATCH -p <partition>
#SBATCH -J myarray
#SBATCH -a 1-100%10 # 100 tasks, run 10 at a time
#SBATCH -c 4
#SBATCH --mem=8G
#SBATCH -t 04:00:00
#SBATCH -o %x.%A_%a.out
#SBATCH -e %x.%A_%a.err
# optional:
# module purge
# module use /share/apps/Modules/modulefiles
# module load your_pipeline/2.0
INPUTS=(samples/*_R1.fastq.gz)
SAMPLE=${INPUTS[$((SLURM_ARRAY_TASK_ID-1))]}
# match threads to allocation (if pipeline is multithreaded)
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
export MKL_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
my_pipeline --reads "$SAMPLE"
Submit:
sbatch array_job.sh
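Since array task IDs are 1-based while bash arrays are 0-based, the indexing above subtracts one. You can dry-run that mapping locally before submitting (the sample file names below are placeholders):

```shell
#!/usr/bin/env bash
# Dry run of the array-index-to-sample mapping, outside Slurm.
# File names are illustrative placeholders.
INPUTS=(s1_R1.fastq.gz s2_R1.fastq.gz s3_R1.fastq.gz)
for SLURM_ARRAY_TASK_ID in 1 2 3; do
  # Task IDs are 1-based; bash arrays are 0-based, hence the -1.
  SAMPLE=${INPUTS[$((SLURM_ARRAY_TASK_ID-1))]}
  echo "task $SLURM_ARRAY_TASK_ID -> $SAMPLE"
done
```

Note that the glob in the real script expands in sorted order, so the mapping is stable as long as the input directory does not change between submission and execution.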
Interactive shell: use only for test runs or very small jobs¶
srun --pty -p <partition> -t 01:00:00 -c 4 --mem=8G bash
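An alternative pattern, assuming your site permits it, is to reserve the resources first with salloc and then launch individual commands as job steps. This keeps the allocation while you iterate:

```shell
# Request an allocation (same limits as above), then work inside it.
salloc -p <partition> -t 01:00:00 -c 4 --mem=8G
# Inside the allocation, launch steps with srun, e.g.:
# srun my_program --in input.fq
# exit   # release the allocation when done
```

Either way, the same warning applies: keep interactive sessions short so the resources return to the queue.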
MPI job (OpenMPI via Slurm)¶
#!/usr/bin/env bash
#SBATCH -p <partition>
#SBATCH -N 2 # nodes
#SBATCH --ntasks-per-node=16 # MPI ranks per node
#SBATCH -t 02:00:00
#SBATCH -J mpi_job
#SBATCH -o %x.%j.out
#SBATCH -e %x.%j.err
module purge
# module use /share/apps/Modules/modulefiles
module load openmpi
# Launch ranks with OpenMPI using Slurm's allocation (no srun)
mpirun -np ${SLURM_NTASKS} --map-by ppr:${SLURM_NTASKS_PER_NODE}:node ./mpi_app
# (optional) add: --bind-to core
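If your Slurm installation was built with PMIx support, the ranks can alternatively be launched directly with srun instead of mpirun; whether this works at your site is an assumption you can check with `srun --mpi=list`:

```shell
# Alternative launcher, assuming Slurm has PMIx support.
# srun inherits the allocation (-N 2, --ntasks-per-node=16) automatically.
srun --mpi=pmix ./mpi_app
```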
Job management quick reference¶
# Submit
sbatch job.sh
# Queue & running jobs
squeue -u $USER
squeue -t RUNNING -o "%i %u %T %P %N"
# Detailed job info
scontrol show job <jobid>
# History (after completion)
sacct -j <jobid> --format=JobID,JobName,Partition,State,Elapsed,AllocTRES,MaxRSS,MaxVMSize -P
# Cancel
scancel <jobid> # one
scancel -u $USER # all mine
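After a job finishes, a quick efficiency summary helps right-size future requests. The `seff` helper ships in Slurm's contribs and may or may not be installed at your site; sacct gives a rougher equivalent:

```shell
# CPU/memory efficiency summary (only if seff is installed at your site)
seff <jobid>
# Roughly equivalent fields via sacct:
sacct -j <jobid> --format=JobID,Elapsed,TotalCPU,MaxRSS,ReqMem -P
```

If MaxRSS is far below the requested memory, lower `--mem` in the next submission; over-requesting delays your own queue starts.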