PP-725: new "keep <select>" option for "pbs_release_nodes"

Follows the PBS Pro Design Document Guidelines.

Overview

This design enhances the "node ramp down" feature by introducing a new option, "-k <select>" ("k" for "keep"), to the PBS command "pbs_release_nodes". The option lets users or admins retain the sister nodes/vnodes that satisfy the "select" argument while the node ramp down operation releases the rest.

Technical Details

Interface 1:  -k <select statement>

  • Change Control: Stable
  • Synopsis: This new option to "pbs_release_nodes" specifies a select statement that is a subset of the job submission (or qalter'ed) select statement. It describes the nodes/vnodes that are to be kept assigned to the job while the remaining sister nodes/vnodes are released. The released nodes/vnodes are then made available for scheduling other jobs. The resource list in a chunk spec of the sub select statement can be a partial one with respect to the full list in the corresponding chunk of the job submission (or qalter'ed) select statement.
  • Permission: as described in the Ref 1. above

Details:

  • New Syntax :

pbs_release_nodes [-j <job ID>] <vnode> [<vnode> [<vnode>] ...]
pbs_release_nodes [-j <job ID>] -a
pbs_release_nodes [-j <job ID>] -k <select statement>
pbs_release_nodes --version 
 

  • Example of usage :
    Let's submit a job with the following select string:

$ qsub -l select=4:model=abc:ncpus=5+3:model=abc:bigmem=true:ncpus=1+2:model=def:ncpus=32  job.scr
120.pbssrv

Grepping for the assigned vnodes may then show:

$ qstat -f 120| egrep exec_vnode
exec_vnode = (nd_abc_1:ncpus=5)+(nd_abc_2:ncpus=5)+(nd_abc_3[0]:ncpus=5)+(nd_abc_3[1]:ncpus=5)+(nd_abc_4_bm:ncpus=1)+(nd_abc_5_bm:ncpus=1)+(nd_abc_6_bm:ncpus=1)+(nd_def_1:ncpus=32)+(nd_def_2:ncpus=32)

Here the first chunk "(nd_abc_1:ncpus=5)" represents the mother superior node, while each of the remaining chunks represents a sister node/vnode.

The node statuses are:

$ pbsnodes -av
nd_abc_1
    Mom = nd_abc_1.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.model = abc
    resources_available.ncpus = 5
    resources_assigned.ncpus = 5

nd_abc_2
    Mom = nd_abc_2.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.model = abc
    resources_available.ncpus = 5
    resources_assigned.ncpus = 5

nd_abc_3[0]
    Mom = nd_abc_3.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.model = abc
    resources_available.ncpus = 5
    resources_assigned.ncpus = 5

nd_abc_3[1]
    Mom = nd_abc_3.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.model = abc
    resources_available.ncpus = 5
    resources_assigned.ncpus = 5

nd_abc_4_bm
    Mom = nd_abc_4_bm.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.bigmem = True
    resources_available.model = abc
    resources_available.ncpus = 1
    resources_assigned.ncpus = 1

nd_abc_5_bm
    Mom = nd_abc_5_bm.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.bigmem = True
    resources_available.model = abc
    resources_available.ncpus = 1
    resources_assigned.ncpus = 1

nd_abc_6_bm
    Mom = nd_abc_6_bm.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.bigmem = True
    resources_available.model = abc
    resources_available.ncpus = 1
    resources_assigned.ncpus = 1

nd_def_1
    Mom = nd_def_1.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.model = def
    resources_available.ncpus = 32
    resources_assigned.ncpus = 32

nd_def_2
    Mom = nd_def_2.pbspro.com
    state = job-busy
    jobs = 120.pbssrv/0
    resources_available.model = def
    resources_available.ncpus = 32
    resources_assigned.ncpus = 32

Now run pbs_release_nodes with the new "-k" option, giving it a select argument that is a sub statement of the select string used in qsub -l:

$ pbs_release_nodes -j 120 -k select=model=abc:ncpus=5+2:model=abc:bigmem=true:ncpus=1

This releases the vnodes (nd_abc_3[0]:ncpus=5)+(nd_abc_3[1]:ncpus=5)+(nd_abc_6_bm:ncpus=1)+(nd_def_1:ncpus=32)+(nd_def_2:ncpus=32) from the job while retaining the vnodes (nd_abc_1:ncpus=5)+(nd_abc_2:ncpus=5)+(nd_abc_4_bm:ncpus=1)+(nd_abc_5_bm:ncpus=1).

After the release, the job has the following vnodes associated with it:

$ qstat -f 120| egrep exec_vnode
exec_vnode = (nd_abc_1:ncpus=5)+(nd_abc_2:ncpus=5)+(nd_abc_4_bm:ncpus=1)+(nd_abc_5_bm:ncpus=1)
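A script that wants to confirm which vnodes were released can compare the exec_vnode values seen before and after the call. The following is a minimal Python sketch, assuming the two values have already been captured as strings (for example from the qstat output shown above); the helper name vnode_names is hypothetical and not part of PBS.

```python
def vnode_names(exec_vnode):
    """Return the vnode names in an exec_vnode value, in order of appearance."""
    names = []
    for piece in exec_vnode.split("+"):
        # Each piece looks like "(name:res=val...)"; a parenthesis may appear
        # on only one side when a single chunk spans several vnodes.
        piece = piece.strip().strip("()")
        names.append(piece.split(":", 1)[0])
    return names


# exec_vnode values copied from the example above.
before = ("(nd_abc_1:ncpus=5)+(nd_abc_2:ncpus=5)+(nd_abc_3[0]:ncpus=5)"
          "+(nd_abc_3[1]:ncpus=5)+(nd_abc_4_bm:ncpus=1)+(nd_abc_5_bm:ncpus=1)"
          "+(nd_abc_6_bm:ncpus=1)+(nd_def_1:ncpus=32)+(nd_def_2:ncpus=32)")
after = ("(nd_abc_1:ncpus=5)+(nd_abc_2:ncpus=5)"
         "+(nd_abc_4_bm:ncpus=1)+(nd_abc_5_bm:ncpus=1)")

kept = set(vnode_names(after))
released = [name for name in vnode_names(before) if name not in kept]
print(released)
# → ['nd_abc_3[0]', 'nd_abc_3[1]', 'nd_abc_6_bm', 'nd_def_1', 'nd_def_2']
```

Comparing the two values after the fact, rather than predicting the outcome, fits the caveat that the order in which matching vnodes are kept or released is undefined.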

  • Using Partial Chunk Resource List :

The same result as in the previous example can be achieved with the shorter select string below, where each chunk's resource list is a partial one with respect to the original select supplied to qsub.

$ pbs_release_nodes -j 120 -k select=model=abc+2:bigmem=true
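To illustrate what "partial" means here, the sketch below parses both select strings and reports which chunks of the original select each chunk of the "-k" select is a partial resource list of. It is an illustration only, not the server's matching code: parse_select and is_partial_of are hypothetical helper names, and the parser assumes the simple "N:res=val:res=val" chunk form used in this document.

```python
def parse_select(select):
    """Parse 'select=N:res=val:...+...' into a list of (count, {res: val})."""
    if select.startswith("select="):
        select = select[len("select="):]
    chunks = []
    for spec in select.split("+"):
        parts = spec.split(":")
        count = 1  # a chunk without a leading number counts once
        if parts[0].isdigit():
            count = int(parts[0])
            parts = parts[1:]
        resources = dict(p.split("=", 1) for p in parts)
        chunks.append((count, resources))
    return chunks


def is_partial_of(keep_res, orig_res):
    """True if every resource named in keep_res has the same value in orig_res."""
    return all(orig_res.get(name) == value for name, value in keep_res.items())


original = ("select=4:model=abc:ncpus=5+3:model=abc:bigmem=true:ncpus=1"
            "+2:model=def:ncpus=32")
keep = "select=model=abc+2:bigmem=true"

orig_chunks = parse_select(original)
# For each "-k" chunk, the indexes of the original chunks it is a partial of.
matches = [
    [i for i, (_, orig_res) in enumerate(orig_chunks)
     if is_partial_of(keep_res, orig_res)]
    for _, keep_res in parse_select(keep)
]
print(matches)  # → [[0, 1], [1]]
```

Note that "model=abc" alone is a partial list of both the first and the second original chunk. Which matching vnodes are actually kept is decided by the server, and that order is undefined.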

  • Errors and Return codes :
    • When the command with the new option executes successfully, it prints the output below on the console and exits with code 0:
             pbs_release_nodes: <sub select string>
    • The "-k" option cannot be used in conjunction with the "-a" option. If it is, pbs_release_nodes prints the error below along with the usage strings.

      pbs_release_nodes: -a and -k options cannot be used together

    • The "-k" option cannot be used together with a host/vnode list argument (<vnode> [<vnode> [<vnode>] ...]). If it is, pbs_release_nodes prints the error below along with the usage strings.

      pbs_release_nodes: cannot supply node list with -k option

    • When the argument string to the "-k" option doesn't start with the "select=" string:
              pbs_release_nodes: only a "select=" string is valid in -k option
    • When the sub select statement supplied contains undefined resources:
              pbs_release_nodes: Unknown resource: <undefined res name>
    • For all other failures, including a sub select string that cannot be satisfied, the error below is printed:
              pbs_release_nodes: Server returned error 15010 for job
  • Accounting Logs :
    • No new accounting logs are introduced. See Ref 2. above.
  • Caveats :
    • The order in which nodes/vnodes are selected to be released or kept by the "-k <select>" option is undefined. Users, admins, and their scripts/tools should therefore not depend on, or try to predict, which of the matching nodes/vnodes are released or kept.
    • If one or more nodes/vnodes targeted for release still have job chunks/processes running on them, the release operation will terminate those chunks/processes abruptly.
    • Taken together, the previous two caveats mean that a running job may lose some of its running job chunks when this new option is used.
    • Since the mother superior cannot be ramped down, the portion of the select resource request associated with the mother superior is internally appended to the sub select string supplied with "-k <select>".
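
The argument checks listed under "Errors and Return codes" can be sketched as below. This is a hedged Python illustration, not the pbs_release_nodes source: the function name check_release_args and the order of the checks are assumptions, while the error strings are the ones given in this document. The checks that need the server (unknown resources, a select that cannot be satisfied) are left out.

```python
def check_release_args(release_all, keep_select, vnodes):
    """Return the documented error message for an invalid option mix, else None.

    release_all -- True if "-a" was given
    keep_select -- the argument of "-k", or None if "-k" was not given
    vnodes      -- list of explicit vnode arguments (may be empty)
    """
    if keep_select is None:
        return None  # nothing specific to "-k" to validate
    if release_all:
        return "pbs_release_nodes: -a and -k options cannot be used together"
    if vnodes:
        return "pbs_release_nodes: cannot supply node list with -k option"
    if not keep_select.startswith("select="):
        return 'pbs_release_nodes: only a "select=" string is valid in -k option'
    return None


print(check_release_args(True, "select=model=abc", []))
print(check_release_args(False, "model=abc", []))
print(check_release_args(False, "select=model=abc+2:bigmem=true", []))
```

A wrapper script could run checks like these before invoking the command, so that obviously invalid combinations are rejected without contacting the server.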


API level details:

  • The "select" string parameter will be passed to "pbs_relnodesjob()" through its "extend" argument, which is of type "char *".




