Add your comments in the Discussion Forum.


This design only applies to Cray ALPS systems.

On a Cray ALPS system, when a job's script finishes running, PBS sends ALPS a release reservation request.  PBS then polls intermittently until it receives the ALPS response "No entry for resId".  This response indicates to PBS that the ALPS reservation has been successfully canceled.
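The release-and-poll sequence described above can be sketched roughly as follows. This is an illustration only, not the PBS implementation: `wait_for_release`, `send_release`, `next_interval`, and the `max_tries` cap are hypothetical names and assumptions introduced for the example. Only the "No entry for resId" response text comes from this design.

```python
import time
from typing import Callable

# Response text that this design names as the sign the reservation is gone.
RELEASED_MARKER = "No entry for resId"


def wait_for_release(send_release: Callable[[], str],
                     next_interval: Callable[[int], float],
                     sleep: Callable[[float], None] = time.sleep,
                     max_tries: int = 10) -> int:
    """Send release requests until ALPS reports the reservation is gone.

    send_release  -- hypothetical callable that sends one release request
                     and returns the ALPS response text
    next_interval -- hypothetical callable giving the wait (in seconds)
                     after the given zero-based attempt
    max_tries     -- assumed safety cap; not part of this design

    Returns the number of requests that were sent.
    """
    for attempt in range(max_tries):
        response = send_release()
        if RELEASED_MARKER in response:
            return attempt + 1
        sleep(next_interval(attempt))
    raise TimeoutError("ALPS reservation still present after max_tries requests")
```

The wait between requests is passed in as a function, so the same loop works with either of the interval schemes discussed in the sections that follow.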

What happens today

Today, the time between successive ALPS release reservation requests grows exponentially with each try.  PBS also adds a random 0-4 seconds to each interval as jitter, so that when many jobs end at the same time PBS does not overwhelm ALPS with reservation release requests all at once.  The jitter spreads the release requests for different reservations across different intervals.  The total time between ALPS release reservation requests is therefore the exponentially growing base interval plus a randomly generated value between 0 and 4 seconds.  Both the interval and the jitter timings were requested by Cray.
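As a rough illustration of the current behavior, the interval before each retry could be computed as below. The 0-4 second jitter range comes from the description above. The 1-second starting value, the doubling factor, and the use of a continuous (rather than whole-second) random value are assumptions made for the sketch, and `legacy_interval` is a hypothetical name.

```python
import random

MAX_JITTER_SECONDS = 4  # from the text: 0-4 seconds of jitter per interval


def legacy_interval(attempt: int,
                    rng: random.Random,
                    base_seconds: float = 1.0) -> float:
    """Return the wait in seconds after the given zero-based attempt.

    The exponential part is assumed to double on each try, starting
    from base_seconds; the real base and exponent are not stated here.
    """
    backoff = base_seconds * (2 ** attempt)      # assumed: 1, 2, 4, 8, ...
    jitter = rng.uniform(0, MAX_JITTER_SECONDS)  # random 0-4 seconds
    return backoff + jitter


if __name__ == "__main__":
    rng = random.Random()
    for attempt in range(5):
        print(f"try {attempt}: wait {legacy_interval(attempt, rng):.2f}s")
```

Under these assumptions the exponential term dominates after a few tries, which is why a released reservation can go unnoticed for a long time.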

New proposal

Cray says that things have changed and we should now be able to poll at a different interval.  This way, the release of a job's ALPS reservation can be discovered sooner, and the next job can use those resources sooner.  The best way for PBS to handle this is to put the control in the PBS administrator's hands.  Two new mom tunables allow the PBS administrator to adjust the base interval and the amount of potential jitter independently.  The total interval is the value of alps_release_wait_time plus a randomly generated value based on alps_release_jitter.  The minimum wait time interval is implementation dependent and may differ between versions of ALPS and PBS Pro.  The supplied value may be adjusted (rounded or truncated) based on the available resolution.

Tunable 1 - alps_release_wait_time
Tunable 2 - alps_release_jitter

This was turned into a tunable so that the PBS administrator can increase or decrease the amount of jitter added to the interval.
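The proposed interval calculation can be sketched as follows. The tunable names alps_release_wait_time and alps_release_jitter come from this design. Treating both values as seconds, drawing the jitter uniformly from 0 up to alps_release_jitter, and rejecting negative values are assumptions made for illustration; `release_interval` is a hypothetical helper, and the rounding or truncation to the available resolution is not modeled.

```python
import random
from typing import Optional


def release_interval(alps_release_wait_time: float,
                     alps_release_jitter: float,
                     rng: Optional[random.Random] = None) -> float:
    """Return one total wait: the base wait time plus random jitter.

    Both arguments are assumed to be in seconds. The jitter is assumed
    to be uniform between 0 and alps_release_jitter.
    """
    if alps_release_wait_time < 0 or alps_release_jitter < 0:
        raise ValueError("tunable values must not be negative")
    rng = rng or random.Random()
    jitter = rng.uniform(0, alps_release_jitter)
    return alps_release_wait_time + jitter


if __name__ == "__main__":
    # Example values chosen only for illustration.
    for _ in range(3):
        print(f"wait {release_interval(0.4, 0.12):.3f}s")
```

Unlike the exponential scheme, the wait here does not grow with the number of tries, so setting alps_release_jitter to 0 in this sketch gives a fixed polling interval.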



