https://pbspro.atlassian.net/browse/PP-864
Overview:
Cray X* series systems can suspend one or more jobs in order to run a higher-priority job. PBS needs to modify the suspend pseudo-signal (used by the qsig command and by preemption) so that suspend and resume work on Cray X* series systems.
Important things to note:
- Cray systems with a Gemini interconnect do NOT support suspend/resume
- Cray systems with an Aries interconnect and newer Cray X* series systems DO support suspend/resume
- To enable suspend/resume, set suspendResume 1 in /etc/opt/cray/alps/alps.conf (using xtopview on CLE 5.2 and earlier CLE releases) and then restart ALPS
- Please refer to Cray's System Administration Guide for more details about using suspend/resume on Cray X* series
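As a rough aid for the configuration note above, the following Python sketch checks whether a copy of alps.conf turns the setting on. It assumes the setting sits on its own line as "suspendResume <value>" and that '#' starts a comment. Neither assumption is confirmed by this document, and the helper name suspend_resume_enabled is hypothetical.

```python
import re

# Assumption: the setting appears on its own line as "suspendResume <value>".
_SETTING = re.compile(r"^\s*suspendResume\s+(\S+)")


def suspend_resume_enabled(conf_text: str) -> bool:
    """Return True if the last suspendResume setting found has the value 1."""
    value = None
    for line in conf_text.splitlines():
        # Assumption: '#' begins a comment; ignore everything after it.
        line = line.split("#", 1)[0]
        match = _SETTING.match(line)
        if match:
            value = match.group(1)
    return value == "1"


# Hypothetical excerpt; a real alps.conf contains many other settings.
sample = "# hypothetical excerpt\nsuspendResume 1\n"
print(suspend_resume_enabled(sample))
```

In practice the text would be read from /etc/opt/cray/alps/alps.conf. The check only inspects the file, so ALPS still has to be restarted after any change.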
Interface #1 -
Log messages that will appear in the MoM logs:
MoM log message #1: "ALPS reservation <ALPS reservation ID> SWITCH status is = 'EMPTY'"
- Unstable
- Logged at PBSEVENT_DEBUG2
- An 'EMPTY' response (which means there is no claim on the ALPS reservation) can be received incorrectly when a claim on the ALPS reservation actually exists. PBS prints this log message so that it is possible to see how often the false 'EMPTY' response is received.
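One way to see how often the message shows up is to count it per reservation in the MoM logs. The sketch below is a minimal illustration, not part of PBS. It matches only the message text quoted above, assumes the reservation ID is a run of digits, and uses placeholder log prefixes. It counts every 'EMPTY' message and cannot tell which responses were false.

```python
import re
from collections import Counter

# Matches the message text quoted in this document. Assumption: the ALPS
# reservation ID is a run of digits.
EMPTY_MSG = re.compile(r"ALPS reservation (\d+) SWITCH status is = 'EMPTY'")


def count_empty_responses(log_lines):
    """Count 'EMPTY' SWITCH status messages per ALPS reservation ID."""
    counts = Counter()
    for line in log_lines:
        match = EMPTY_MSG.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines; "<prefix>" stands in for the real log line fields.
sample_log = [
    "<prefix>;ALPS reservation 1234 SWITCH status is = 'EMPTY'",
    "<prefix>;ALPS reservation 1234 SWITCH status is = 'EMPTY'",
    "<prefix>;ALPS reservation 5678 SWITCH status is = 'EMPTY'",
    "<prefix>;some unrelated message",
]

for resv_id, n in sorted(count_empty_responses(sample_log).items()):
    print(f"reservation {resv_id}: {n} 'EMPTY' message(s)")
```

Because the message is logged at PBSEVENT_DEBUG2, it appears only when MoM is logging at that event level.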