squeue status and reason codes

The squeue command details a variety of information on an active job’s status with state and reason codes. Job state codes describe a job’s current state in queue (e.g. pending, completed). Job reason codes describe the reason why the job is in its current state.

The following tables outline a variety of job state and reason codes you may encounter when using squeue to check on your jobs.

Job State Codes

Status Code Explaination
COMPLETED CD The job has completed successfully.
COMPLETING CG The job is finishing but some processes are still active.
FAILED F The job terminated with a non-zero exit code and failed to execute.
PENDING PD The job is waiting for resource allocation. It will eventually run.
PREEMPTED PR The job was terminated because of preemption by another job.
RUNNING R The job currently is allocated to a node and is running.
SUSPENDED S A running job has been stopped with its cores released to other jobs.
STOPPED ST A running job has been stopped with its cores retained.

A full list of these Job State codes can be found in Slurm’s documentation.

Job Reason Codes

Reason Code Explaination
Priority One or more higher priority jobs is in queue for running. Your job will eventually run.
Dependency This job is waiting for a dependent job to complete and will run afterwards.
Resources The job is waiting for resources to become available and will eventually run.
InvalidAccount The job’s account is invalid. Cancel the job and rerun with correct account.
InvaldQoS The job’s QoS is invalid. Cancel the job and rerun with correct account.
QOSGrpCpuLimit All CPUs assigned to your job’s specified QoS are in use; job will run eventually.
QOSGrpMaxJobsLimit Maximum number of jobs for your job’s QoS have been met; job will run eventually.
QOSGrpNodeLimit All nodes assigned to your job’s specified QoS are in use; job will run eventually.
PartitionCpuLimit All CPUs assigned to your job’s specified partition are in use; job will run eventually.
PartitionMaxJobsLimit Maximum number of jobs for your job’s partition have been met; job will run eventually.
PartitionNodeLimit All nodes assigned to your job’s specified partition are in use; job will run eventually.
AssociationCpuLimit All CPUs assigned to your job’s specified association are in use; job will run eventually.
AssociationMaxJobsLimit Maximum number of jobs for your job’s association have been met; job will run eventually.
AssociationNodeLimit All nodes assigned to your job’s specified association are in use; job will run eventually.

A full list of these Job Reason Codes can be found in Slurm’s documentation.