Slurm Job Script Templates
The following sections offer Slurm job script templates and descriptions for various use cases on CARC high-performance computing (HPC) clusters.
If you’re not familiar with the Slurm job scheduler or submitting jobs, please see the guide for Running Jobs.
The Slurm option --cpus-per-task refers to logical CPUs. On CARC clusters, compute nodes have two sockets with one physical multi-core processor per socket, such that 1 logical CPU = 1 core = 1 thread. These terms may be used interchangeably.
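If you want to verify this layout for yourself, one option (a minimal sketch; <node_name> is a placeholder for an actual compute node hostname, which you can find with sinfo or squeue) is to query the node's configuration with scontrol:
# Show the CPU topology that Slurm reports for a specific compute node
scontrol show node <node_name> | grep -E "Sockets|CoresPerSocket|ThreadsPerCore|CPUTot"
The Sockets, CoresPerSocket, and ThreadsPerCore fields in the output show how the logical CPU count (CPUTot) is derived.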
Single-threaded jobs
A single-threaded (or single-core or serial) job uses 1 CPU (core/thread) on 1 compute node. This is the most basic job that can be submitted, but it does not fully utilize the compute resources available on CARC HPC clusters.
An example job script:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=8G
#SBATCH --time=1:00:00
module purge
module load julia/1.9.3
julia script.jl
The --cpus-per-task option requests the specified number of CPUs. There is 1 thread per CPU, so only 1 CPU is needed for a single-threaded job.
Multi-threaded jobs
A multi-threaded (or multi-core or multi-process) job uses multiple CPUs (cores/threads) with shared memory on 1 compute node. This is a common use case as it enables basic parallel computing.
An example job script:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=1:00:00
module purge
module load julia/1.9.3
julia --threads $SLURM_CPUS_PER_TASK script.jl
The --cpus-per-task option requests the specified number of CPUs. There is 1 thread per CPU, so multi-threaded jobs require more than 1 CPU. The number of CPUs and amount of memory that can be used varies across compute nodes. For more details, see the Discovery resource overview or the Endeavour resource overview.
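For a quick look from the command line (a minimal sketch; adjust the partition name as needed), you can also ask sinfo to report the CPU and memory configuration of the nodes in a partition:
# List hostname, CPU count, and memory (in MB) for each node in the main partition
sinfo --partition=main --Node --format="%n %c %m"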
Please note that you may have to modify your scripts and programs to explicitly use multiple CPUs (cores/threads), depending on the application or programming language you are using.
Some compiled programming languages like C, C++, and Fortran use OpenMP for multi-threading. In these cases, you should compile your programs with an OpenMP flag and explicitly set the environment variable OMP_NUM_THREADS (the number of threads to parallelize over) in your job scripts. The value of OMP_NUM_THREADS should equal the value of the --cpus-per-task option in the job script. You can use the Slurm-provided environment variable SLURM_CPUS_PER_TASK to set OMP_NUM_THREADS.
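For reference, a C program using OpenMP could be compiled ahead of time with GCC's -fopenmp flag (a minimal sketch; omp_program.c is a hypothetical source file matching the omp_program binary used in the script below):
# Compile a C source file with OpenMP support enabled
gcc -O2 -fopenmp -o omp_program omp_program.c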
An example job script using OpenMP:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=1:00:00
module purge
module load gcc/11.3.0
ulimit -s unlimited
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./omp_program
Keep in mind that your project accounts are charged based on resources allocated, so only request as many CPUs and as much memory as needed. You may need to experiment with the requests and monitor your job to find the optimal resource requests. Also keep in mind that requesting fewer resources will typically result in shorter job queue times.
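One way to check what a completed job actually used (a minimal sketch, assuming standard Slurm accounting is enabled; replace <job_id> with your job's ID) is to query the accounting database with sacct:
# Report allocated CPUs, wall time, CPU time used, and peak memory for a finished job
sacct -j <job_id> --format=JobID,AllocCPUS,Elapsed,TotalCPU,MaxRSS
If the seff utility is installed on the cluster, seff <job_id> provides a similar summary of CPU and memory efficiency.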
Single-threaded MPI jobs
The Message Passing Interface (MPI) is a message-passing standard used in parallel programming, typically for multi-node, distributed processor and distributed memory use cases.
An example job script for a single-threaded MPI program:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=epyc-64
#SBATCH --constraint=epyc-7542
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=64
#SBATCH --cpus-per-task=1
#SBATCH --mem=0
#SBATCH --time=24:00:00
module purge
module load gcc/11.3.0
module load openmpi/4.1.4
ulimit -s unlimited
srun --mpi=pmix_v2 -n $SLURM_NTASKS ./mpi_program
The --constraint option specifies a node feature (CPU model) to use. The --nodes option specifies how many nodes to use, and the --ntasks-per-node option specifies the number of tasks (MPI ranks) to run per node. The --cpus-per-task option specifies the number of CPUs (threads) to use per task. There is 1 thread per CPU, so only 1 CPU per task is needed for a single-threaded MPI job. The --mem=0 option requests all available memory per node. Alternatively, you could use the --mem-per-cpu option.
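As a sketch of that alternative (the 3G value is an arbitrary example, not a recommendation), the memory request in the script above could be replaced with a per-CPU amount:
# Request memory per allocated CPU instead of all memory on the node (replaces --mem=0)
#SBATCH --mem-per-cpu=3G
Note that --mem and --mem-per-cpu are mutually exclusive, so use one or the other.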
For more information, see the Using MPI user guide.
Keep in mind that your project accounts are charged based on resources allocated, so only request as many CPUs and as much memory as needed. You may need to experiment with the requests and monitor your job to find the optimal resource requests. Also keep in mind that requesting fewer resources will typically result in shorter job queue times.
Multi-threaded MPI jobs
Multi-threaded MPI programs use multi-threaded tasks, typically hybrid MPI/OpenMP programs. If using OpenMP for threading, set the environment variable OMP_NUM_THREADS, which specifies the number of threads to parallelize over. The value of OMP_NUM_THREADS should equal the value of the --cpus-per-task option in the job script. You can use the Slurm-provided environment variable SLURM_CPUS_PER_TASK to set OMP_NUM_THREADS.
An example job script for a multi-threaded MPI program:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=epyc-64
#SBATCH --constraint=epyc-7542
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=32
#SBATCH --mem=0
#SBATCH --time=24:00:00
module purge
module load gcc/11.3.0
module load openmpi/4.1.4
ulimit -s unlimited
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun --mpi=pmix_v2 -n $SLURM_NTASKS -c $SLURM_CPUS_PER_TASK ./mpi_plus_openmp_program
The --constraint option specifies a node feature (CPU model) to use. The --nodes option specifies how many nodes to use, and the --ntasks-per-node option specifies the number of tasks (MPI ranks) to run per node. The --cpus-per-task option specifies the number of CPUs (threads) to use per task. The --mem=0 option requests all available memory per node. Alternatively, you could use the --mem-per-cpu option.
For more information, see the Using MPI user guide.
Keep in mind that your project accounts are charged based on resources allocated, so only request as many CPUs and as much memory as needed. You may need to experiment with the requests and monitor your job to find the optimal resource requests. Also keep in mind that requesting fewer resources will typically result in shorter job queue times.
GPU jobs
Some programs can take advantage of the unique hardware architecture in a graphics processing unit (GPU). GPUs can be used for specialized scientific computing work, including 3D modelling and machine learning.
An example job script for GPUs:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=gpu
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --gpus-per-task=a40:1
#SBATCH --mem=16G
#SBATCH --time=1:00:00
module purge
module load nvhpc/22.11
./program
For more information, see the GPUs guide.
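If you want to confirm which GPU the job was assigned before your program runs (a minimal sketch, assuming nvidia-smi is available on the GPU node and that Slurm sets CUDA_VISIBLE_DEVICES for the allocation), you could add a quick check to the job script:
# Print the GPU devices Slurm exposed to this job, then show their status
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
nvidia-smi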
Keep in mind that your project accounts are charged based on resources allocated, so only request as many CPUs and GPUs and as much memory as needed. You may need to experiment with the requests and monitor your job to find the optimal resource requests. Also keep in mind that requesting fewer resources will typically result in shorter job queue times.
Job arrays
Job arrays allow you to use a single job script to launch many similar jobs. Common use cases include:
- Varying model parameters (e.g., running simulations with different conditions)
- Processing different input files (e.g., running data analysis pipelines with different datasets)
An example job script for a job array:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G
#SBATCH --time=1:00:00
#SBATCH --array=1-10
module purge
module load gcc/11.3.0
./my_program --input=input_${SLURM_ARRAY_TASK_ID} --output=output_${SLURM_ARRAY_TASK_ID}
This example uses a program my_program that takes the options --input and --output to specify the paths to the input and output files, assuming the input files are named input_1, input_2, etc. Job scripts for job arrays typically make use of the environment variable SLURM_ARRAY_TASK_ID to iterate over parameters or input files in some manner. This example job script would launch 10 jobs with the same sbatch options but using different input files and creating different output files, based on the SLURM_ARRAY_TASK_ID index (in this example, 1-10). Array job 1 would use input_1 and create output_1, array job 2 would use input_2 and create output_2, and so on. This is one possible setup for an array job, but alternative setups are also possible.
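For example, one common alternative (a minimal sketch; params.txt is a hypothetical file with one parameter set per line) is to use SLURM_ARRAY_TASK_ID to select a line from a parameter file:
# Read the line of params.txt that corresponds to this array task (task 1 reads line 1, etc.)
params=$(sed -n "${SLURM_ARRAY_TASK_ID}p" params.txt)
./my_program $params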
Please note that the job array size should match the input size (i.e., the number of simulations to run or the number of datasets to process). Make sure that the resources you request are sufficient for one individual job (not the entire array as a whole).
Keep in mind that your project accounts are charged based on resources allocated, so only request as many CPUs and GPUs and as much memory as needed. You may need to experiment with the requests and monitor your job to find the optimal resource requests. Also keep in mind that requesting fewer resources will typically result in shorter job queue times.
Job packing
Job packing refers to packing multiple tasks (or sub-jobs) into one Slurm job. This is useful for running a small number of tasks at the same time on different cores within the same Slurm job.
For a large number of tasks, the srun approach shown below should be avoided because it negatively impacts the job scheduler; use a workflow tool instead. For running a large number of short-running tasks, use Launcher; see the Launcher user guide.
An example job script for job packing:
#!/bin/bash
#SBATCH --account=<project_id>
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=16G
#SBATCH --time=24:00:00
module purge
module load gcc/11.3.0
module load openblas/0.3.20
module load r/4.3.2
srun --ntasks=1 --exclusive --output=slurm-%J.out Rscript script1.R &
srun --ntasks=1 --exclusive --output=slurm-%J.out Rscript script2.R &
srun --ntasks=1 --exclusive --output=slurm-%J.out Rscript script3.R &
srun --ntasks=1 --exclusive --output=slurm-%J.out Rscript script4.R &
wait
The srun command launches individual tasks on different cores. Using & at the end of each srun line runs the task in the background so that the next task can be launched. Combined with the wait command, this allows the tasks to run in parallel and the job to exit only once all tasks have completed.
Keep in mind that your project accounts are charged based on resources allocated, so only request as many CPUs and GPUs and as much memory as needed. You may need to experiment with the requests and monitor your job to find the optimal resource requests. Also keep in mind that requesting fewer resources will typically result in shorter job queue times.