Overview:

There are many events in the lifecycle of a job. This EDD focuses on logging accounting records for a job's suspend and resume events. Currently, a job's resource usage is available when the job ends, in the "E" record written to the accounting logs. However, the resources requested and used can change during the life of the job. The objective of this EDD is to record the correct resource usage at the suspend and resume events.

'z' record:

  • Upon job suspension, a 'z' record shall be written to the accounting log.
  • The record shall consist of:
    • resources_released (if available)
    • resources_used

Example:

1. The resources_released list shall be available only if the server attribute restrict_res_to_release_on_suspend is set, for example:
qmgr -c 's s restrict_res_to_release_on_suspend+=ncpus'

a. Submit a job requesting ncpus

04/04/2020 00:56:06;z;1003.pbsserver;resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.ncpus=2 resources_used.vmem=0kb resources_used.walltime=00:00:00 resources_released=(pbsserver:ncpus=2)

b. Submit a job requesting ncpus and memory
04/04/2020 00:55:12;z;1002.pbsserver;resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=3432kb resources_used.ncpus=1 resources_used.vmem=336452kb resources_used.walltime=00:00:32 resources_released=(pbsserver:ncpus=1)


2. restrict_res_to_release_on_suspend is unset
04/04/2020 00:52:45;z;1002.pbsserver;resources_used.cpupercent=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.ncpus=1 resources_used.vmem=0kb resources_used.walltime=00:00:00
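
The 'z' examples above share one layout: a timestamp, the record type, the job ID, and a space-separated list of key=value pairs, joined by semicolons. The following is a minimal sketch of how a consumer of the accounting log might split such a line. The function name parse_accounting_record is hypothetical, the sample line is illustrative, and the code assumes that attribute values contain no spaces.

```python
from datetime import datetime

def parse_accounting_record(line):
    """Split 'timestamp;type;job_id;attributes' into its parts.

    Hypothetical helper, not part of PBS. Assumes attribute values
    contain no whitespace.
    """
    timestamp, record_type, job_id, message = line.rstrip("\n").split(";", 3)
    attrs = {}
    for token in message.split():
        # Split on the first '=' only, so a value such as
        # '(pbsserver:ncpus=1)' keeps its inner '='.
        key, _, value = token.partition("=")
        attrs[key] = value
    return {
        "time": datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S"),
        "type": record_type,
        "job_id": job_id,
        "attrs": attrs,
    }

# Illustrative line in the same layout as the examples.
sample = ("04/04/2020 00:55:12;z;1002.pbsserver;"
          "resources_used.ncpus=1 resources_used.walltime=00:00:32 "
          "resources_released=(pbsserver:ncpus=1)")
rec = parse_accounting_record(sample)
print(rec["type"], rec["job_id"], rec["attrs"].get("resources_released"))
```

Using attrs.get("resources_released") returns None when the list is absent, which covers the case where restrict_res_to_release_on_suspend is unset.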

'r' record:

  • Upon resuming a job, an 'r' record shall be written to the accounting log.

Example:

04/04/2020 00:54:43;r;1002.pbsserver;
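
Because a 'z' record marks a suspension and an 'r' record marks the matching resume, a log reader could pair them to estimate how long each job stayed suspended. The sketch below illustrates the idea under stated assumptions: suspension_intervals is a hypothetical helper, the log lines are illustrative, and records are assumed to arrive in chronological order.

```python
from datetime import datetime

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"

def suspension_intervals(lines):
    """Pair each 'z' record with the next 'r' record for the same job.

    Hypothetical helper. Returns a list of (job_id, seconds_suspended).
    Assumes the lines are in chronological order.
    """
    pending = {}    # job_id -> time of the most recent unmatched 'z'
    intervals = []
    for line in lines:
        stamp, record_type, job_id, _ = line.split(";", 3)
        when = datetime.strptime(stamp, TIME_FORMAT)
        if record_type == "z":
            pending[job_id] = when
        elif record_type == "r" and job_id in pending:
            start = pending.pop(job_id)
            intervals.append((job_id, (when - start).total_seconds()))
    return intervals

# Illustrative records: one suspend followed by one resume.
log = [
    "04/04/2020 00:52:45;z;1002.pbsserver;resources_used.walltime=00:00:00",
    "04/04/2020 00:54:43;r;1002.pbsserver;",
]
print(suspension_intervals(log))  # [('1002.pbsserver', 118.0)]
```

An 'r' record with no preceding 'z' for the same job is ignored rather than treated as an error, since the log being read may start after the suspension.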