Job Monitoring - squeue

In Slurm the main way to monitor your job is with the squeue command.

> squeue
JOBID PARTITION     NAME USER ST  TIME NODES NODELIST(REASON)
   29    cosma7 job_name  arj  R 16:59   110 m[7001-7083,7085-7111]

Most of the information is straightforward. Column 5 (ST) shows the state of the job, where R=Running, PD=Pending, CA=Cancelled, CF=Configuring, CG=Completing, CD=Completed, F=Failed. The sixth column (TIME) shows the elapsed time, and the final column (NODELIST(REASON)) lists the assigned nodes or, for a job that is not yet running, the reason it is waiting.
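
As a rough illustration of reading the state column, the sketch below tallies jobs per state from squeue-style output. The helper names state_name and count_states are hypothetical, not part of Slurm. The sketch assumes the default column layout shown above (state in column 5, a single header line, no spaces in job names) and only knows the state codes listed in this section.

```shell
# Hypothetical helper: expand a squeue state code into the name listed above.
state_name() {
    case "$1" in
        R)  echo "Running" ;;
        PD) echo "Pending" ;;
        CA) echo "Cancelled" ;;
        CF) echo "Configuring" ;;
        CG) echo "Completing" ;;
        CD) echo "Completed" ;;
        F)  echo "Failed" ;;
        *)  echo "Unknown($1)" ;;   # any code not covered in this section
    esac
}

# Hypothetical helper: tally jobs per state from squeue-style text on stdin.
# Assumes the state is the 5th whitespace-separated column and that the
# first line is the header.
count_states() {
    awk 'NR > 1 && NF >= 5 { n[$5]++ } END { for (s in n) print s, n[s] }' |
    sort |
    while read -r code count; do
        echo "$(state_name "$code"): $count"
    done
}

# Usage on a system with Slurm installed:
#   squeue | count_states
```

Parsing the default output this way is fragile if a job name contains spaces, so treat it as a quick inspection aid rather than a robust tool.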

When using the squeue command there are some options:

-u   Request jobs or job steps from a comma-separated list of users. The list can consist of user names or user ID numbers. Performance of the command can be measurably improved for systems with large numbers of jobs when a single user is specified.
--start Report the expected start time and resources to be allocated for pending jobs in order of increasing start time. This is equivalent to the following options: --format="%.18i %.9P %.8j %.8u %.2t %.19S %.6D %20Y %R", --sort=S and --states=PENDING. Any of these options may be explicitly changed as desired by combining the --start option with other option values (e.g. to use a different output format). The expected start time of pending jobs is only available if Slurm is configured to use the backfill scheduling plugin.
-j  Requests a comma-separated list of job IDs to display. Defaults to all jobs. The --jobs option may be used in conjunction with the --steps option to print step information about specific jobs. Note: If a list of job IDs is provided, the jobs are displayed even if they are on hidden partitions. Since this option's argument is optional, for proper parsing the single-letter option must be followed immediately by the value, with no space between them. For example "-j1008" and not "-j 1008". The job ID format is "job_id[_array_id]". Performance of the command can be measurably improved for systems with large numbers of jobs when a single job ID is specified. By default, this field size will be limited to 64 bytes. Use the environment variable SLURM_BITSTR_LEN to specify larger field sizes.
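
The options above can be combined on one command line. The sketch below only builds and prints the command lines rather than running them, so it can be tried without a Slurm installation. The helper join_commas is hypothetical. The user name arj and job ID 1008 come from the examples in this section, while alice and 1009 are made-up placeholders.

```shell
# Hypothetical helper: join its arguments with commas, which is the list
# format that -u and -j expect.
join_commas() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Jobs belonging to two users (alice is a placeholder user name).
user_cmd="squeue -u $(join_commas arj alice)"

# Expected start times for one user's pending jobs.
start_cmd="squeue --start -u arj"

# Specific job IDs: the value follows -j immediately, with no space.
job_cmd="squeue -j$(join_commas 1008 1009)"

# Print the command lines; paste one into a shell on the cluster to run it.
printf '%s\n' "$user_cmd" "$start_cmd" "$job_cmd"
```

Writing the long form --jobs=1008,1009 is one way to avoid the spacing pitfall of the single-letter option.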