Slurm Cheatsheet
A compact reference for Slurm commands and useful options, with examples.
0.0.1 Custom CARC Slurm commands
myaccount - View account information for user
nodeinfo - View node information by partition, CPU/GPU model, and state
noderes - View node resources
myqueue - View job queue information for user
jobqueue - View job queue information
jobhist - View compact history of user’s jobs
jobinfo - View detailed job information
Each command has an associated help page (e.g., jobinfo --help).
0.0.2 Job submission
salloc - Obtain a job allocation for interactive use (docs)
sbatch - Submit a batch script for later execution (docs)
srun - Obtain a job allocation and run an application (docs)
Option | Description |
---|---|
-A, --account=<account> | Account to be charged for resources used |
-a, --array=<index> | Job array specification (sbatch only) |
-b, --begin=<time> | Initiate job after specified time |
-C, --constraint=<features> | Required node features |
--cpu-bind=<type> | Bind tasks to specific CPUs (srun only) |
-c, --cpus-per-task=<count> | Number of CPUs required per task |
-d, --dependency=<state:jobid> | Defer job until specified jobs reach specified state |
-m, --distribution=<method[:method]> | Specify distribution methods for remote processes |
-e, --error=<filename> | File in which to store job error messages (sbatch and srun only) |
-x, --exclude=<name> | Specify host names to exclude from job allocation |
--exclusive | Reserve all CPUs and GPUs on allocated nodes |
--export=<name=value> | Export specified environment variables (e.g., all, none) |
--gpus-per-task=<list> | Number of GPUs required per task |
-J, --job-name=<name> | Job name |
-l, --label | Prepend task ID to output (srun only) |
--mail-type=<type> | E-mail notification type (e.g., begin, end, fail, requeue, all) |
--mail-user=<address> | E-mail address |
--mem=<size>[units] | Memory required per allocated node (e.g., 16GB) |
--mem-per-cpu=<size>[units] | Memory required per allocated CPU (e.g., 2GB) |
-w, --nodelist=<hostnames> | Specify host names to include in job allocation |
-N, --nodes=<count> | Number of nodes required for the job |
-n, --ntasks=<count> | Number of tasks to be launched |
--ntasks-per-node=<count> | Number of tasks to be launched per node |
-o, --output=<filename> | File in which to store job output (sbatch and srun only) |
-p, --partition=<names> | Partition in which to run the job |
--signal=[B:]<num>[@time] | Signal job when approaching time limit |
-t, --time=<time> | Limit for job run time |
Examples:
# Request interactive job on debug node with 4 CPUs
salloc -p debug -c 4
# Request interactive job with V100 GPU
salloc -p gpu --ntasks=1 --gpus-per-task=v100:1
# Submit batch job
sbatch batch.job
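For sbatch, the options above are usually set with #SBATCH directives at the top of the batch script. The script below is a minimal sketch; the partition name, resource amounts, and program name are placeholders for illustration.
#!/bin/bash
#SBATCH --partition=main
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=16GB
#SBATCH --time=01:00:00
#SBATCH --job-name=myjob

# Commands to run inside the allocation
./my_program
Job dependencies can be expressed at submission time with the --dependency option from the table above, for example:
# Submit a job that starts only after job 111111 completes successfully
sbatch --dependency=afterok:111111 batch.job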
0.0.3 Job management
squeue - View information about jobs in scheduling queue (docs)
Option | Description |
---|---|
-A, --account=<account_list> | Filter by accounts (comma-separated list) |
-o, --format=<options> | Output format to display |
-j, --jobs=<job_id_list> | Filter by job IDs (comma-separated list) |
-l, --long | Show more available information |
--me | Filter by your own jobs |
-n, --name=<job_name_list> | Filter by job names (comma-separated list) |
-p, --partition=<partition_list> | Filter by partitions (comma-separated list) |
-P, --priority | Sort jobs by priority |
--start | Show the expected start time and resources to be allocated for pending jobs |
-t, --states=<state_list> | Filter by states (comma-separated list) |
-u, --user=<user_list> | Filter by users (comma-separated list) |
Examples:
# View your own job queue
squeue --me
# View your own job queue with estimated start times for pending jobs
squeue --me --start
# View job queue on specified partition in long format
squeue -lp epyc-64
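The --format option accepts printf-style field specifiers. As a sketch using common specifiers (%i job ID, %P partition, %j job name, %u user, %t state, %M elapsed time, %R reason or node list):
# View your own jobs with a custom output format
squeue --me -o "%.10i %.12P %.20j %.8u %.3t %.12M %R"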
scancel - Signal or cancel jobs, job arrays, or job steps (docs)
Option | Description |
---|---|
-A, --account=<account> | Restrict to the specified account |
-n, --name=<job_name> | Restrict to jobs with specified name |
-w, --nodelist=<hostnames> | Restrict to jobs using the specified host names (comma-separated list) |
-p, --partition=<partition> | Restrict to the specified partition |
-t, --state=<state> | Restrict to jobs in the specified state (pending, running, or suspended) |
-u, --user=<username> | Restrict to the specified user |
Examples:
# Cancel specific job
scancel 111111
# Cancel all your own jobs
scancel -u $USER
# Cancel your own jobs on specified partition
scancel -u $USER -p oneweek
# Cancel your own jobs in specified state
scancel -u $USER -t pending
sprio - View job scheduling priorities (docs)
Option | Description |
---|---|
-o, --format=<options> | Output format to display |
-j, --jobs=<job_id_list> | Filter by job IDs (comma-separated list) |
-l, --long | Show more available information |
-n, --norm | Show the normalized priority factors |
-p, --partition=<partition_list> | Filter by partitions (comma-separated list) |
-u, --user=<user_list> | Filter by users (comma-separated list) |
Examples:
# View normalized job priorities for your own jobs
sprio -nu $USER
# View normalized job priorities for specified partition
sprio -nlp gpu
0.0.4 Job accounting
sacct - View job accounting data (docs)
Option | Description |
---|---|
-A, --account=<account_list> | Filter by accounts (comma-separated list) |
-X, --allocations | Show job allocations, but not job steps |
-a, --allusers | Show jobs for all users |
-E, --endtime=<time> | End of reporting period |
-o, --format=<options> | Output format to display |
-j, --jobs=<job_id_list> | Filter by job IDs (comma-separated list) |
--name=<job_name_list> | Filter by job names (comma-separated list) |
-N, --nodelist=<hostnames> | Filter by host names (comma-separated list) |
-r, --partition=<partition_list> | Filter by partitions (comma-separated list) |
-S, --starttime=<time> | Start of reporting period |
-s, --state=<state_list> | Filter by states (comma-separated list) |
-u, --user=<user_list> | Filter by users (comma-separated list) |
Examples:
# View accounting data for specific job with custom format
sacct -j 111111 --format=jobid,jobname,submit,exitcode,elapsed,reqnodes,reqcpus,reqmem
# View compact accounting data for your own jobs for specified time range
sacct -X -S 2022-07-01 -E 2022-07-14
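To check how much memory and time a completed job actually used, fields such as MaxRSS and Elapsed can be requested; note that MaxRSS is reported on job steps, so -X is omitted here. A sketch:
# View resource usage for a completed job
sacct -j 111111 --format=jobid,state,elapsed,reqmem,maxrss,exitcode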
sacctmgr - View or modify account information (docs)
sacctmgr show associations
sacctmgr show user <username>
Option | Description |
---|---|
cluster=<clusters> | Filter by clusters (e.g., condo, discovery) |
format=<options> | Output format to display |
user=<user_list> | Filter by users (comma-separated list) |
Examples:
# View your own associations with custom format
sacctmgr show associations user=$USER format=cluster,account,user,qos
sreport - Generate reports from accounting data (docs)
sreport cluster accountutilizationbyuser
sreport cluster userutilizationbyaccount
sreport job sizesbyaccount
sreport user topusage
Option | Description |
---|---|
-T, --tres=<resource_list> | Resources to report (e.g., cpu, gpu, mem, billing, all) |
clusters=<clusters> | Filter by clusters (e.g., condo, discovery) |
end=<time> | End of reporting period |
format=<options> | Output format to display |
start=<time> | Start of reporting period |
accounts=<account_list> | Filter by accounts (comma-separated list) |
users=<user_list> | Filter by users (comma-separated list) |
nodes=<hostnames> | Filter by host names (comma-separated list) (job reports only) |
partitions=<partition_list> | Filter by partitions (comma-separated list) (job reports only) |
printjobcount | Print the number of jobs run instead of time used (job reports only) |
Examples:
# Report account utilization for specified user and time range
sreport cluster accountutilizationbyuser start=2022-07-01 end=2022-07-14 users=$USER
# Report account utilization by user for specified account and time range
sreport cluster userutilizationbyaccount start=2022-07-01 end=2022-07-14 accounts=ttrojan_123
# Report job sizes for specified partition
sreport job sizesbyaccount partitions=epyc-64 printjobcount
# Report top users for specified account and time range
sreport user topusage start=2022-07-01 end=2022-07-14 accounts=ttrojan_123
0.0.5 Partition and node information
sinfo - View information about nodes and partitions (docs)
Option | Description |
---|---|
-o, --format=<options> | Output format to display |
-l, --long | Show more available information |
-N, --Node | Show information in a node-oriented format |
-n, --nodes=<hostnames> | Filter by host names (comma-separated list) |
-p, --partition=<partition_list> | Filter by partitions (comma-separated list) |
-t, --states=<state_list> | Filter by node states (comma-separated list) |
-s, --summarize | Show summary information |
Examples:
# View all partitions and nodes by state
sinfo
# Summarize node states by partition
sinfo -s
# View nodes in idle state
sinfo --states=idle
# View nodes for specified partition in long, node-oriented format
sinfo -lNp epyc-64
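The --format option also accepts printf-style field specifiers. As a sketch using common specifiers (%P partition, %a availability, %l time limit, %D node count, %t state):
# View partition availability and node counts with a custom output format
sinfo -o "%.15P %.5a %.10l %.6D %.6t"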
scontrol - View or modify configuration and state (docs)
scontrol show partition <partition>
scontrol show node <hostname>
scontrol show job <job_id>
Option | Description |
---|---|
-d, --details | Show more details |
-o, --oneliner | Show information on one line |
scontrol hold <job_list>
scontrol release <job_list>
scontrol show hostnames
Examples:
# View information for specified partition
scontrol show partition epyc-64
# View information for specified node
scontrol show node b22-01
# View detailed information for running job
scontrol show job 111111 -d
# View hostnames for job (one name per line)
scontrol show hostnames
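As a usage sketch for the hold and release subcommands listed above (the job ID is arbitrary):
# Hold a pending job, then release it for scheduling
scontrol hold 111111
scontrol release 111111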
0.0.6 Output environment variables
Variable | Description |
---|---|
SLURM_ARRAY_TASK_COUNT | Number of tasks in job array |
SLURM_ARRAY_TASK_ID | Job array task ID |
SLURM_CPUS_PER_TASK | Number of CPUs requested per task |
SLURM_JOB_ACCOUNT | Account used for job |
SLURM_JOB_ID | Job ID |
SLURM_JOB_NAME | Job Name |
SLURM_JOB_NODELIST | List of nodes allocated to job |
SLURM_JOB_NUM_NODES | Number of nodes allocated to job |
SLURM_JOB_PARTITION | Partition used for job |
SLURM_NTASKS | Number of job tasks |
SLURM_PROCID | MPI rank of current process |
SLURM_SUBMIT_DIR | Directory from which job was submitted |
SLURM_TASKS_PER_NODE | Number of job tasks per node |
Examples:
# Specify OpenMP threads
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# Specify MPI tasks
srun -n $SLURM_NTASKS ./mpi_program
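As a sketch of how the array-related variables are typically used, the following hypothetical batch script runs one task per array index; the input file naming scheme and program name are assumptions for illustration.
#!/bin/bash
#SBATCH --array=1-10
#SBATCH --cpus-per-task=2
#SBATCH --time=00:30:00

# Each array task processes its own input file (input_1.txt ... input_10.txt)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_program input_${SLURM_ARRAY_TASK_ID}.txt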