This is a design proposal to configure PBS so that it releases only a limited set of resources (as specified by the admin) when a job is suspended.
...
- Visibility: Public
- Change Control: Stable
- Details:
- A new server attribute "restrict_res_to_release_on_suspend" is a comma-separated list of resource names. The resources released on suspension are restricted to those listed in "restrict_res_to_release_on_suspend".
- This server attribute is of type "array_string" and can only be set by a manager.
- If a manager tries to set the attribute to a list that contains a nonexistent resource, the qmgr command prints the following error on the console:
# qmgr -c "s s restrict_res_to_release_on_suspend = 'ncpus, abcd'"
qmgr obj=abcd svr=default: Unknown resource
...
- If unset, PBS releases all the consumable resources requested by the job when the job is suspended.
- By default this attribute is unset.
- A PBS manager can also add resources to, or remove resources from, the "restrict_res_to_release_on_suspend" attribute by using the "+=" and "-=" operators.
- The resources specified in this new server attribute are released (provided the job has requested them) every time a job is suspended (by preemption or qsig).
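
The selection rule described in this interface can be sketched in Python. This is a minimal illustration, not the PBS implementation: the function names, the `known_resources` and `consumable` sets, and the use of `None` to model the unset attribute are all assumptions made for the example.

```python
def validate_restrict_list(names, known_resources):
    """Reject the list if any name is not a known resource.

    Hypothetical stand-in for the check qmgr performs when the
    attribute is set; the error text mirrors "Unknown resource".
    """
    for name in names:
        if name not in known_resources:
            raise ValueError(f"Unknown resource: {name}")
    return list(names)


def resources_to_release(requested, consumable, restrict_list=None):
    """Return the requested resources that are released on suspend.

    requested     -- dict of resource name -> amount requested by the job
    consumable    -- set of resource names assumed to be consumable
    restrict_list -- list of names from the server attribute,
                     or None when the attribute is unset
    """
    released = {}
    for name, amount in requested.items():
        if name not in consumable:
            continue  # only consumable resources are ever released
        if restrict_list is None or name in restrict_list:
            released[name] = amount
    return released
```

With the attribute unset, every consumable resource the job requested is returned; with the attribute set to `ncpus`, only `ncpus` is returned, and listed resources the job never requested are simply absent from the result.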
...
• It stores a string that describes the amount of resources released on each chunk that the job was running on (provided these resources are also listed in the "restrict_res_to_release_on_suspend" attribute). The format of the string is similar to that of exec_vnode.
...
• This job attribute is populated at the time of job suspension only if the "restrict_res_to_release_on_suspend" server attribute is set and contains a list of valid resources to be released.
...
• It stores the cumulative value of all the consumable resources requested by the job (provided these resources are also listed in the "restrict_res_to_release_on_suspend" attribute).
Using the example in interface 2: qstat -f 1 | grep resource_released_list
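
As a rough illustration of how a cumulative value could be derived from a per-chunk string, the sketch below parses an exec_vnode-style string and sums each resource across chunks. The exact string layout, the vnode names, the function names, and the restriction to integer-valued resources are assumptions made for the example; the proposal only states that the format is similar to that of exec_vnode.

```python
from collections import Counter


def parse_released(released):
    """Split an exec_vnode-style string into (vnode, {resource: amount}) pairs.

    Assumed layout: "(vnodeA:ncpus=2:ngpus=1)+(vnodeB:ncpus=1)".
    Only integer amounts are handled in this sketch.
    """
    chunks = []
    for part in released.split("+"):
        vnode, *pairs = part.strip("()").split(":")
        amounts = {}
        for pair in pairs:
            name, value = pair.split("=")
            amounts[name] = int(value)
        chunks.append((vnode, amounts))
    return chunks


def cumulative_released(released):
    """Sum each released resource over all chunks."""
    totals = Counter()
    for _vnode, amounts in parse_released(released):
        totals.update(amounts)  # Counter.update adds the amounts
    return dict(totals)
```

Resources with size units, such as mem, would need unit-aware arithmetic that this sketch deliberately leaves out.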
...
• This job attribute is populated only if the "restrict_res_to_release_on_suspend" server attribute is set and contains a list of valid resources to be released.
...
- Visibility: Public
- Change Control: Stable
- Details:
- If an admin tries to delete a custom resource that is part of the restrict_res_to_release_on_suspend server attribute, the qmgr command fails with a "resource busy" error code.
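
The deletion guard described above can be modelled as follows. This is only a sketch: `ResourceBusyError`, `delete_custom_resource`, and the in-memory set of custom resources are hypothetical names, not part of PBS.

```python
class ResourceBusyError(Exception):
    """Hypothetical stand-in for qmgr's "resource busy" failure."""


def delete_custom_resource(name, custom_resources, restrict_list):
    """Delete a custom resource unless the server attribute still lists it.

    custom_resources -- set of defined custom resource names (modified in place)
    restrict_list    -- names currently held in the server attribute
    """
    if name in restrict_list:
        # The resource is still referenced, so the deletion is refused.
        raise ResourceBusyError(f"resource busy: {name}")
    custom_resources.discard(name)
```

Under this model, the admin would first remove the resource from the server attribute (for example with the "-=" operator) and then delete it.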
...