# Interactive Jobs
## Allocation
`salloc` is used to allocate resources in real time for an interactive job.
Typically, it allocates resources and spawns a shell; from that shell, you
execute `srun` commands to launch parallel tasks.
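As a sketch of that workflow (account name `mxxxx`, time limit, and binary name are placeholders):

```shell
# Request one interactive node; this spawns a shell on success
salloc --nodes 1 --qos interactive --time 00:30:00 --constraint cpu --account mxxxx

# From inside the spawned shell, launch parallel tasks on the allocation
srun --ntasks 4 ./my_app
```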
### "interactive" QOS on Perlmutter
Perlmutter has a dedicated interactive QOS to support medium-length interactive work. This QOS is intended to deliver nodes for interactive use within 6 minutes of the job request.
Warning

On Perlmutter, if you have not set a default account, `salloc` may fail with an account-related error message. Specify your account explicitly with the `--account` flag to avoid this.
#### Perlmutter GPU nodes
When using `srun`, you must explicitly request GPU resources:
use the `--gpus` (`-G`), `--gpus-per-node`, or `--gpus-per-task` flag to make
the allocated node's GPUs visible to your `srun` command.
Otherwise, your tasks will fail with GPU-related errors.
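For example, either of the following (run inside a GPU allocation; task counts and the binary name are placeholders) makes GPUs visible to the tasks:

```shell
# Expose all four of the node's GPUs to every task
srun --ntasks 4 --gpus-per-node 4 ./my_gpu_app

# Or assign one GPU to each task
srun --ntasks 4 --gpus-per-task 1 ./my_gpu_app
```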
When requesting an interactive node on the Perlmutter GPU compute nodes,
you must use a project name that ends in `_g` (e.g., `mxxxx_g`) to submit any jobs
to the GPU nodes. The constraint flag must also be set to GPU for any interactive
jobs (`-C gpu` or `--constraint gpu`). Otherwise, the job submission will be
rejected with an error.
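Putting both requirements together, a GPU interactive request might look like this (account name and GPU count are placeholders):

```shell
# Note the _g suffix on the account and the gpu constraint
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint gpu --gpus 4 --account mxxxx_g
```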
#### Perlmutter CPU nodes
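By analogy with the GPU case, a CPU interactive request uses the `cpu` constraint and the account name without the `_g` suffix; a minimal sketch (account name is a placeholder):

```shell
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint cpu --account mxxxx
```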
### Limits
If resources are not readily available for the requested interactive job, it is
automatically canceled after 6 minutes. To allow a job to wait for resources
for a longer time, use the optional `--immediate` flag to specify the
number of seconds that the job should wait for available resources:
```shell
# wait for up to 10 minutes
salloc --nodes 1 --qos interactive --time 01:00:00 --constraint <node type> --account mxxxx --immediate=600
```
Interactive jobs are limited to a maximum of 4 nodes on
both the cpu and gpu partitions. For more details, see QOS Limits and Charges.