...

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • Upon startup, the PBS server will start all schedulers whose "scheduling" attribute is set to "True".
      • The "PBS_START_SCHED" pbs.conf variable is now deprecated, and its value will be overridden by each scheduler's "scheduling" attribute.
    • The PBS server will connect to these schedulers on their respective host names and port numbers.
    • Scheduling cycles for all configured schedulers are started by the PBS server when a job is queued or finishes, when the "scheduling" attribute is set to True, or when scheduler_iteration has elapsed.
      • When a job is queued or finishes, the server will check the job's queue and try to connect to that queue's scheduler to run a scheduling cycle.
      • If a scheduler is already running a scheduling cycle, the server will wait for that cycle to finish before trying to start another one.
      • If job_accumulation_time is set, the server will wait until that time has passed after a job's submission before starting a new cycle.
    • Each scheduler queries the server for the whole universe of information: all schedulers, the server, queues, nodes, etc. (this avoids IFL changes). It then does the following:
      • It filters the running, queued, and exiting jobs from the queues associated with its partition(s).
      • It filters the list of nodes associated with the partition(s) managed by the scheduler.
      • It filters the list of global policies, such as run soft/hard limits, set on the server object.
    • The PBS init script will now report the status of the PBS server only. Schedulers will be managed by the server, and their status can be fetched using a qmgr command.
      • When the pbs_server daemon is stopped using "qterm -s", it will also stop all running scheduler processes.
      • While shutting down pbs_server, the PBS init script will use the "-s" option to qterm so that all schedulers come down along with the server.
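
The per-scheduler filtering step described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the scheduler's actual code: the `Queue`, `Job`, and `Node` classes and the `filter_universe` function are hypothetical names introduced here, and job states are assumed to be the single-letter codes "Q" (queued), "R" (running), and "E" (exiting).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


# Hypothetical, simplified stand-ins for the objects a scheduler receives
# when it queries the whole universe from the server.
@dataclass
class Queue:
    name: str
    partition: Optional[str]


@dataclass
class Job:
    job_id: str
    queue: str
    state: str  # assumed single-letter state: "Q", "R", "E", ...


@dataclass
class Node:
    name: str
    partition: Optional[str]


def filter_universe(
    sched_partitions: Iterable[str],
    queues: List[Queue],
    jobs: List[Job],
    nodes: List[Node],
) -> Tuple[List[Job], List[Node]]:
    """Keep only the jobs and nodes that belong to this scheduler's partitions."""
    parts = set(sched_partitions)
    # Queues associated with one of the scheduler's partitions.
    my_queues = {q.name for q in queues if q.partition in parts}
    # Running, queued, and exiting jobs that sit in those queues.
    my_jobs = [j for j in jobs if j.queue in my_queues and j.state in {"Q", "R", "E"}]
    # Nodes associated with one of the scheduler's partitions.
    my_nodes = [n for n in nodes if n.partition in parts]
    return my_jobs, my_nodes
```

In this model, a scheduler created for partition "P1" would call `filter_universe(["P1"], queues, jobs, nodes)` and ignore everything else in the universe, which is why no change to the query interface is needed.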


Interface 8: Reservations (changes to pbs_rsub command)

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • Reservations can now be submitted to a specific partition using a new "-p" option with the pbs_rsub command.
    • The "-p" option of the pbs_rsub command takes a partition name as input and makes pbs_server trigger a scheduling cycle of the scheduler that is servicing the partition.
    • If the scheduler servicing the requested partition isn't up and running, the PBS server will store the reservation and mark it as "UNCONFIRMED" until it is able to trigger a scheduling cycle of that scheduler.
    • In a multi-sched environment, reservations can be confirmed by any scheduler servicing their respective partitions.
    • After the reservations are confirmed, they are assigned the partition their node solution came from.
    • A scheduler servicing two partitions, P1 and P2, would try to confirm a reservation on nodes from either P1 or P2, but not a mix of both. Once the reservation is confirmed, it has a partition attribute set on it to identify where it was confirmed. Similarly, the reservation queue also gets a partition attribute set on it (matching the reservation).
      • Example:

        % pbs_rsub -lselect=1:ncpus=2 -R 1030 -D1200 -I 5
        R865.centos CONFIRMED

        % pbs_rstat -f R865
        Resv ID: R865.centos
        Reserve_Name = NULL
        Reserve_Owner = root@centos
        reserve_type = 2
        reserve_state = RESV_CONFIRMED
        reserve_substate = 2
        reserve_start = Mon Feb 03 10:30:00 2020
        reserve_end = Mon Feb 03 10:50:00 2020
        reserve_duration = 1200
        queue = R865
        Resource_List.ncpus = 2
        Resource_List.nodect = 1
        Resource_List.select = 1:ncpus=2
        Resource_List.place = free
        Resource_List.walltime = 00:20:00
        schedselect = 1:ncpus=2
        resv_nodes = (vnode2:ncpus=2)
        Authorized_Users = root@centos
        server = centos
        ctime = Mon Feb 03 10:08:55 2020
        mtime = Mon Feb 03 10:09:03 2020
        interactive = 5
        Variable_List = PBS_O_LOGNAME=root,PBS_O_HOST=centos,PBS_O_MAIL=/var/spool/mail/arung,PBS_TZID=America/Los_Angeles
        euser = root
        egroup = root
        partition = P3

        % qmgr -c "l q R865"
        Queue R865
        queue_type = Execution
        total_jobs = 0
        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0
        acl_user_enable = True
        acl_users = root@centos
        resources_max.ncpus = 2
        resources_available.ncpus = 2
        enabled = True
        started = False
        partition = P3

    • Once a reservation is confirmed and a partition is assigned to it, it cannot be re-confirmed or altered in any other partition.

    • PBS assigns the default partition name "pbs-default" to all reservations (and their queues) confirmed by the default scheduler. If an admin tries to assign the partition name "pbs-default" to a scheduler, queue, or node, the qmgr command throws the error "Default partition name is not allowed".
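
The two reservation rules above (a node solution comes from exactly one partition, and "pbs-default" is a reserved name) can be sketched as follows. This is an illustrative model only: `confirm_reservation` and `validate_partition_name` are hypothetical names, the request is reduced to a list of per-chunk ncpus values, and the first-fit placement is an assumption made for brevity, not the scheduler's real node-selection algorithm.

```python
from typing import Dict, List, Optional, Tuple

RESERVED_PARTITION = "pbs-default"


def validate_partition_name(name: str) -> None:
    """Reject the reserved default partition name, as qmgr is described to do."""
    if name == RESERVED_PARTITION:
        raise ValueError("Default partition name is not allowed")


def confirm_reservation(
    chunks: List[int],
    free_by_partition: Dict[str, Dict[str, int]],
) -> Optional[Tuple[str, List[Tuple[str, int]]]]:
    """Try to place every chunk on vnodes of a single partition.

    chunks: ncpus needed by each chunk of the reservation.
    free_by_partition: {partition: {vnode: free ncpus}} (hypothetical layout).
    Returns (partition, [(vnode, ncpus), ...]) or None if no single
    partition can hold the whole reservation.
    """
    for partition, vnodes in free_by_partition.items():
        free = dict(vnodes)  # work on a copy; a failed attempt changes nothing
        placement = []
        for need in chunks:
            # First-fit: pick the first vnode with enough free ncpus.
            vnode = next((v for v, n in free.items() if n >= need), None)
            if vnode is None:
                break  # this partition cannot hold the reservation
            free[vnode] -= need
            placement.append((vnode, need))
        else:
            # Every chunk was placed inside this one partition.
            return partition, placement
    return None
```

With `{"P1": {"vn1": 2}, "P2": {"vn2": 2, "vn3": 2}}`, a request for three 2-cpu chunks is not confirmed in this model even though six cpus are free in total, because satisfying it would require mixing nodes from P1 and P2. The returned partition name is what would be recorded on both the reservation and its queue.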

Interface 9: Deleted

Interface 10: Fairshare

...