Objective:
Currently, if the server_dyn_res script hangs or never returns, the scheduler waits indefinitely for it to finish.
The objective of this design document is to propose a solution to this hang.
Interface 1: New Configurable Scheduler attribute: server_dyn_res_prog_timeout
- Visibility: Public
- Change Control: Stable
- Details:
- An admin can configure the scheduler attribute "server_dyn_res_prog_timeout". The value is in seconds; the default is 60 seconds.
- Usage:
qmgr -c "set sched server_dyn_res_prog_timeout = 15"
- PBS starts polling when the server_dyn_res script begins executing and waits up to "server_dyn_res_prog_timeout" seconds for it to complete. If the script has not completed by then, the scheduler ends its interaction with the script and logs an informational timeout message.
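The timeout behaviour described in Interface 1 can be illustrated with a minimal sketch. This is not the scheduler's implementation; it is a standalone Python approximation, and the function name run_dyn_res_script and its parameters are hypothetical. It assumes the script is run as a child process, its output is read on success, and the child is abandoned once the timeout expires.

```python
import subprocess
import sys


def run_dyn_res_script(cmd, timeout_s=60):
    """Run a server_dyn_res-style command (hypothetical helper).

    Returns the script's stdout with surrounding whitespace stripped,
    or None if the script did not finish within timeout_s seconds.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before
        # re-raising, so no hung process is left behind here.
        print("server_dyn_res program timed out", file=sys.stderr)
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # A well-behaved script returns its value.
    print(run_dyn_res_script([sys.executable, "-c", "print(42)"], timeout_s=10))
    # A hanging script is cut off after one second.
    print(run_dyn_res_script(
        [sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=1))
```

The default of 60 in the sketch mirrors the attribute's documented default; how the scheduler treats a resource whose script timed out is not covered by this sketch.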
Interface 2: Log messages
- Visibility: Public
- Change Control: Stable
- Details:
- Once the timeout is reached, an informational timeout message is written to the scheduler log, along the following lines:
pbs_sched;Svr;server_dyn_res;server_dyn_res program timed out