Node maintenance window enhancement

Nodes commonly need maintenance, and the admin usually knows the maintenance window in advance. PBS currently makes such windows difficult to plan. This new feature enhances reservations so that they can serve as proper maintenance windows.

Interface: New '--hosts' option to PBS command 'pbs_rsub'

  • Visibility: public
  • Change Control: Stable
  • Synopsis: The new option submits a special reservation that is allowed to 'run' on unavailable nodes.
  • Details: pbs_rsub with the '--hosts' option may be run only by managers and operators. The 'place' and 'select' resources are generated automatically and must not be combined with '--hosts'; combining them with '--hosts' prints the 'usage' help.
    • The '--hosts' option requires a list of hosts: ' <host1> <host2> <host3> ...'
    • The placement of this reservation is always: '-l place=exclhost'
    • The select is generated from the hosts like this: '-l select=host=<host1>:ncpus=<ncpus_host1>+host=<host2>:ncpus=<ncpus_host2>+host=<host3>:ncpus=<ncpus_host3>+...'
    • The resv_nodes of this reservation is built to request all the ncpus of all vnodes on the requested hosts.
    • This reservation is confirmed immediately after submission by the pbs_rsub command. Overlapping reservations are degraded and will be reconfirmed in the next scheduler iteration.
    • The resv_nodes of each overlapping reservation is modified: the requested vnodes are removed from it. As a result, a running overlapping reservation will not start any new job on the overlapping nodes.
    • Overlapping running jobs are ignored and it is up to the administrators to deal with these jobs.
    • A reservation submitted with '--hosts' ignores the resv_enable attribute on nodes.
    • The reservation prefix is 'M', which stands for maintenance.
    • Submitting this reservation does not trigger a scheduler iteration.
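
The way 'select' and 'place' are derived from the host list can be illustrated with a minimal sketch. It is not the pbs_rsub implementation: the function name build_maintenance_select and the host-to-ncpus mapping passed in are hypothetical, and the sketch assumes the total ncpus of each host is already known to the caller.

```python
# Hypothetical helper: models how the select string described above could be
# assembled. The real pbs_rsub obtains the ncpus values itself.
PLACE = "exclhost"  # the placement is always exclhost for this reservation


def build_maintenance_select(host_ncpus):
    """Return the select string for a mapping of host name -> total ncpus."""
    if not host_ncpus:
        raise ValueError("at least one host is required")
    # One chunk per host, requesting all ncpus of that host.
    chunks = [f"host={host}:ncpus={ncpus}" for host, ncpus in host_ncpus.items()]
    return "+".join(chunks)


if __name__ == "__main__":
    # Example host names and ncpus values are made up for illustration.
    select = build_maintenance_select({"node01": 16, "node02": 32})
    print(f"-l select={select} -l place={PLACE}")
```

With the example values, each host contributes one chunk and the chunks are joined with '+', matching the pattern shown in the list above.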

Interface: New extend parameter 'm' to IFL function 'char *pbs_submit_resv(int connect, struct attropl *attrib, char *extend)'

  • Visibility: public
  • Change Control: Stable
  • Synopsis: The new extend parameter changes the reservation ID prefix.
  • Details: When the extend parameter includes 'm', the returned reservation ID is prefixed with 'M'. This is available only to managers and operators; for all other users, PBSE_PERM (15007) is returned.
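
The rule above can be modelled in a few lines. This is a sketch of the decision only, not server code: the names PbsError and resv_id_prefix are hypothetical, and the default prefix 'R' for ordinary reservations is an assumption that this document does not state.

```python
PBSE_PERM = 15007  # error code quoted in the text above


class PbsError(Exception):
    """Hypothetical exception carrying a PBS error code."""

    def __init__(self, code):
        super().__init__(f"PBS error {code}")
        self.code = code


def resv_id_prefix(extend, is_manager_or_operator, default_prefix="R"):
    """Return the reservation ID prefix for a given extend string.

    default_prefix is an assumed value for reservations without 'm'.
    """
    if extend and "m" in extend:
        if not is_manager_or_operator:
            # Only managers and operators may request the 'M' prefix.
            raise PbsError(PBSE_PERM)
        return "M"
    return default_prefix
```

A caller without manager or operator privilege that passes 'm' gets PBSE_PERM in this model, while a NULL or empty extend leaves the prefix unchanged.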

Interface: New reservation substate 'RESV_IN_CONFLICT'

  • Visibility: public
  • Change Control: Stable
  • Synopsis: The substate means that the reservation overlaps an 'M' reservation.
  • Details: RESV_IN_CONFLICT (abbreviated 'IC') has a similar effect to the substate RESV_DEGRADED. The difference is that a reservation in the new substate is reconfirmed even if all of its nodes are up.
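
The difference between the two substates can be expressed as a small predicate. This is a sketch inferred from the description above, not scheduler code: the function name needs_reconfirmation is hypothetical, and the behaviour shown for RESV_DEGRADED (reconfirmation attempted only when some node is not up) is an assumption drawn from the contrast the text makes.

```python
def needs_reconfirmation(substate, all_nodes_up):
    """Model whether a reservation should be reconfirmed."""
    if substate == "RESV_IN_CONFLICT":
        # Reconfirmed regardless of node state, per the text above.
        return True
    if substate == "RESV_DEGRADED":
        # Assumed: only worth reconfirming while some node is unavailable.
        return not all_nodes_up
    return False
```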