...
- Visibility: Public
- Change Control: Stable
- Details:
- Upon startup, the PBS server starts all schedulers whose "scheduling" attribute is set to "True".
- The "PBS_START_SCHED" pbs.conf variable is now deprecated; its value is overridden by each scheduler's "scheduling" attribute.
- The PBS server connects to these schedulers on their respective host names and port numbers.
- The PBS server starts scheduling cycles for all configured schedulers when a job is queued, when a job finishes, when the "scheduling" attribute is set to True, or when scheduler_iteration has elapsed.
- When a job is queued or finishes, the server checks the job's queue and tries to connect to the scheduler corresponding to that queue to run a scheduling cycle.
- If that scheduler is already running a scheduling cycle, the server waits for the previous cycle to finish before trying to start another one.
- If job_accumulation_time is set, the server waits until that time has passed after the submission of a job before starting a new cycle.
- Each scheduler queries the server for the whole universe: information on all schedulers, the server, queues, nodes, etc. (this avoids IFL changes). It then does the following:
- It filters all the running, queued, and exiting jobs from the queues associated with its partition(s).
- It filters the list of nodes associated with the partition(s) managed by the scheduler.
- It filters the list of all the global policies, such as run soft/hard limits, set on the server object.
- The PBS init script now reports the status of the PBS server only. Schedulers are managed by the server, and their status can be fetched using a qmgr command.
- When the pbs_server daemon is stopped using "qterm -s", it also stops all running scheduler processes.
- While shutting down pbs_server, the PBS init script uses the "-s" option to qterm so that all schedulers come down along with the server.
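The cycle-start conditions and the per-scheduler filtering described above can be sketched in a few lines of Python. This is only an illustrative model, not PBS source code: the `Queue`, `Node`, and `Job` classes, the state strings, and the functions `can_start_cycle` and `filter_universe` are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical data model; none of these names come from the PBS code base.
@dataclass
class Queue:
    name: str
    partition: Optional[str]

@dataclass
class Node:
    name: str
    partition: Optional[str]

@dataclass
class Job:
    job_id: str
    queue: str
    state: str  # assumed values: "queued", "running", "exiting", "held", ...


def can_start_cycle(cycle_in_progress, now, last_submit_time=None,
                    job_accumulation_time=None):
    """Return True if the server may start a new cycle for a scheduler now."""
    # A scheduler runs one cycle at a time; the server waits for it to finish.
    if cycle_in_progress:
        return False
    # Without job_accumulation_time (or a recent submission) nothing delays the cycle.
    if job_accumulation_time is None or last_submit_time is None:
        return True
    # Otherwise wait until the accumulation time has passed since the submission.
    return (now - last_submit_time) >= job_accumulation_time


def filter_universe(partitions, queues, nodes, jobs):
    """Keep only the jobs and nodes that belong to the scheduler's partitions."""
    parts = set(partitions)
    my_queues = {q.name for q in queues if q.partition in parts}
    wanted_states = {"queued", "running", "exiting"}
    my_jobs = [j for j in jobs
               if j.queue in my_queues and j.state in wanted_states]
    my_nodes = [n for n in nodes if n.partition in parts]
    return my_jobs, my_nodes
```

In this model a scheduler serving partition "P1" would call `filter_universe(["P1"], queues, nodes, jobs)` on the full data set returned by the server and then work only on the returned subset.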
Interface 8: Reservations (changes to pbs_rsub command)
- Visibility: Public
- Change Control: Stable
- Details:
- Reservations can now be submitted to a specific partition using a new "-p" option with the pbs_rsub command.
- The "-p" option takes a partition name as input and makes pbs_server trigger a scheduling cycle of the scheduler servicing that partition.
- If the scheduler servicing the requested partition is not up and running, the PBS server stores the reservation and marks it as "UNCONFIRMED" until it is able to trigger a scheduling cycle of that scheduler.
- In a multi-sched environment, reservations can be confirmed by any scheduler servicing their respective partitions.
- After reservations are confirmed, they are assigned the partition their node solution came from.
- A scheduler servicing two partitions, P1 and P2, tries to confirm a reservation on nodes from either P1 or P2, but not a mix of both. Once the reservation is confirmed, a partition attribute is set on it to identify where it was confirmed. Similarly, the reservation queue gets a partition attribute set on it (matching the reservation).
- Example:
% pbs_rsub -lselect=1:ncpus=2 -R 1030 -D1200 -I 5
R865.centos CONFIRMED
% pbs_rstat -f R865
Resv ID: R865.centos
Reserve_Name = NULL
Reserve_Owner = root@centos
reserve_type = 2
reserve_state = RESV_CONFIRMED
reserve_substate = 2
reserve_start = Mon Feb 03 10:30:00 2020
reserve_end = Mon Feb 03 10:50:00 2020
reserve_duration = 1200
queue = R865
Resource_List.ncpus = 2
Resource_List.nodect = 1
Resource_List.select = 1:ncpus=2
Resource_List.place = free
Resource_List.walltime = 00:20:00
schedselect = 1:ncpus=2
resv_nodes = (vnode2:ncpus=2)
Authorized_Users = root@centos
server = centos
ctime = Mon Feb 03 10:08:55 2020
mtime = Mon Feb 03 10:09:03 2020
interactive = 5
Variable_List = PBS_O_LOGNAME=root,PBS_O_HOST=centos,PBS_O_MAIL=/var/spool/mail/arung,PBS_TZID=America/Los_Angeles
euser = root
egroup = root
partition = P3
% qmgr -c "l q R865"
Queue R865
queue_type = Execution
total_jobs = 0
state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0
acl_user_enable = True
acl_users = root@centos
resources_max.ncpus = 2
resources_available.ncpus = 2
enabled = True
started = False
partition = P3
- Once a reservation is confirmed and a partition is assigned to it, it cannot be re-confirmed or altered in any other partition.
- PBS assigns the default partition name "pbs-default" to all reservations (and their queues) confirmed by the default scheduler. If an admin tries to assign the partition name "pbs-default" to a scheduler, queue, or node, the qmgr command fails with the error "Default partition name is not allowed".
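The partition rules for reservations described above can be illustrated with a small Python model. It is a sketch under stated assumptions, not how PBS implements confirmation: `pick_partition` and `validate_partition_name` are hypothetical helpers, free nodes are reduced to simple name lists, and trying partitions in the listed order is an assumption made only for this example.

```python
DEFAULT_PARTITION = "pbs-default"  # name reserved for the default scheduler


def pick_partition(needed_nodes, free_nodes_by_partition, scheduler_partitions):
    """Confirm on nodes from exactly one partition, never a mix.

    Returns (partition, nodes) for the first partition that can satisfy the
    request on its own, or None if no single partition has enough free nodes.
    """
    for part in scheduler_partitions:
        free = free_nodes_by_partition.get(part, [])
        if len(free) >= needed_nodes:
            # The chosen partition would then be recorded on the reservation
            # and on its reservation queue.
            return part, free[:needed_nodes]
    return None


def validate_partition_name(name):
    """Reject the reserved default name when an admin assigns a partition."""
    if name == DEFAULT_PARTITION:
        raise ValueError("Default partition name is not allowed")
    return name
```

With one free node in P1 and two in P2, a request for two nodes is confirmed entirely in P2, while a request for three nodes stays unconfirmed even though three nodes are free across both partitions.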
Interface 9: Deleted
Interface 10: Fairshare
...