
  • Interface 1: Extend PBS to support a list of scheduler objects

    • Visibility: Public
    • Change Control: Stable
    • Details:
      • PBS supports a list of scheduler objects, created using qmgr, similar to how nodes are created on the server.
      • The qmgr command can be used to create a scheduler object. It must be invoked by a PBS admin/manager.
      • A name for the scheduler must be given while creating a scheduler object:
        • qmgr -c "c sched multi_sched_1"
      • Creating a scheduler object creates/sets the following attributes on it:
        • sched_port - port number on which multi_sched_1 is going to run. This is a mandatory parameter and must be set before starting the scheduler.
        • sched_host - host name on which multi_sched_1 is going to run. This is a mandatory parameter and must be set before starting the scheduler.
        • partition = "None" (default)*
        • sched_priv = $PBS_HOME/sched_priv_<sched-name> (default)*
        • sched_log = $PBS_HOME/sched_log_<sched-name> (default)*
        • scheduling = False (default)*
        • scheduler_iteration = 600 (default)
        • comment - sites can use the comment field, for example, to:
          • flag a scheduler that restarts 2-3 times within an hour due to potential crashes (comment => "NEEDS_ATTENTION")
          • indicate when a particular scheduler is ready to function again (comment => "READY_TO_USE")
        • "*" indicates that the value will be visible when the admin lists or prints the sched object after it is created.
  • Set the priv directory for the scheduler.
    • The directory must be root owned and should have permissions set to "750". By default a sched object has its priv directory set to $PBS_HOME/sched_priv_<sched-name>. If the directory is already used by another scheduler, error code "15216" is set with the error message
      "Another scheduler has same value for its sched_priv directory"
    • qmgr -c "s sched multi_sched_1 sched_priv=/var/spool/pbs/sched_priv_1"
    • If the priv directory is not accessible by the scheduler process, or the scheduler files are not found in the directory, the comment is updated with the following error message
      "PBS failed validation checks for sched_priv directory"
  • Set the log directory for the scheduler.
    • The directory must be root owned and should have permissions set to "755". By default a sched object has its logs directory set to $PBS_HOME/sched_log_<sched-name>. If the directory is already used by another scheduler, error code "15215" is set with the error message
      "Another scheduler has same value for its sched_log directory"
    • qmgr -c "s sched multi_sched_1 sched_log=/var/spool/pbs/sched_log"
    • If the log directory is not accessible by the scheduler process, the comment is updated with the following error message
      "Unable to change the sched_log directory"
  • To turn scheduling on for one of the newly created scheduler objects, one must use the scheduler name. By default a multi-sched object has scheduling set to False. If no name is specified, the PBS server will enable/disable scheduling on the default scheduler.
    • qmgr -c "s sched <scheduler name> scheduling = 1"
  • The following attributes will be set on the default scheduler if the user sets them on the server:
    • scheduling
    • scheduler_iteration
  • The maximum length of a scheduler name is 15 characters.
  • By default the PBS server will configure a default scheduler which runs out of the box.
    • The name of this default scheduler is "default".
    • The sched_priv directory of the default scheduler is set to $PBS_HOME/sched_priv.
    • The default scheduler logs into the $PBS_HOME/sched_log directory.
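The duplicate-directory rule above (error codes 15215/15216) can be sketched as a small Python model. This is purely illustrative, with hypothetical function and field names; it is not PBS source code, only the error codes and messages come from the text above.

```python
# Illustrative model of the sched_priv/sched_log uniqueness checks; not PBS source.
class SchedError(Exception):
    def __init__(self, code, msg):
        super().__init__(msg)
        self.code = code

def validate_dirs(new_sched, existing_scheds):
    """Reject sched_priv/sched_log values already used by another scheduler."""
    for other in existing_scheds:
        if other["sched_priv"] == new_sched["sched_priv"]:
            raise SchedError(15216, "Another scheduler has same value for its sched_priv directory")
        if other["sched_log"] == new_sched["sched_log"]:
            raise SchedError(15215, "Another scheduler has same value for its sched_log directory")
    return True
```

Pointing a second scheduler at a sched_priv directory that is already in use would raise error 15216, mirroring the qmgr behavior described above.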

Interface 2: Changes to PBS scheduler

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • Scheduler now has additional attributes which can be set in order to run it.
      • sched_priv - points to the directory where the scheduler keeps the fairshare usage, resource_group, holidays file and sched_config.
      • sched_log - points to the directory where the scheduler writes its logs.
      • partition - name of the partition for which this scheduler is going to schedule jobs.
      • sched_host - hostname on which the scheduler is running. For the default scheduler it is set to the PBS server hostname.
      • sched_port - port number on which the scheduler is listening.
      • state - shows the status of the scheduler. This attribute is set only by the PBS server.
    • One can assign only one partition per scheduler object. Once set, the given scheduler object will only schedule jobs from the queues attached to the specified partition.
      • qmgr -c "s sched multi_sched_1 partition='part1'"
    • If no partition is specified for a given scheduler object (other than the default scheduler, on which no partition value can be set), that scheduler will not schedule any jobs.
    • By default, all new queues will be scheduled by the default scheduler, until they have been assigned to a specific partition.
    • A partition once attached to a scheduler cannot be attached to a second scheduler without removing it from the first scheduler. Trying to do so throws the following error:
      • qmgr -c "s sched multi_sched_1 partition='part2'"
      • Partition part2 is already associated with scheduler <scheduler name>.
    • The scheduler object "state" attribute will show one of these 3 values - down, idle, scheduling:
      • If a scheduler object is created but the scheduler is not running for some reason, the state is shown as "down".
      • If a scheduler is up and running but waiting for a cycle to be triggered, the state is shown as "idle".
      • If a scheduler is up and running and also running a scheduling cycle, the state is shown as "scheduling".
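The one-partition-per-scheduler rule above can be sketched as a short Python model. The function and variable names are hypothetical (this is not how the server stores the mapping); only the error text mirrors the qmgr message quoted above.

```python
# Illustrative model of "one partition, one scheduler"; not PBS source code.
def set_partition(schedulers, sched_name, partition):
    """Assign a partition to a scheduler; a partition may have only one owner."""
    for name, part in schedulers.items():
        if part == partition and name != sched_name:
            raise ValueError("Partition %s is already associated with scheduler %s"
                             % (partition, name))
    schedulers[sched_name] = partition
```

Reassigning a partition to a second scheduler raises the error; it must first be removed from the scheduler that owns it.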
    • The default sched object is the only sched object that cannot be deleted. Its state is "idle" by default, since the server's "scheduling" attribute is set to true on a default installation.
    • Setting sched_port, sched_priv and sched_host on the default scheduler is not allowed. The following error message is logged in server_logs when an attempt is made to change its sched_priv directory:
      • qmgr -c "s sched default sched_priv = /tmp"
        Operation is not permitted on default scheduler
    • Starting a new scheduler other than the default scheduler without assigning a partition logs the following error message in the scheduler logs:
               Scheduler does not contain a partition
    • If the scheduler fails to accept a new value for its sched_log directory, the comment of the corresponding scheduler object at the server is updated with the following message, and the scheduling attribute is set to false:
              Unable to change the sched_log directory
    • If the scheduler fails to accept a new value for its sched_priv directory, the comment of the corresponding scheduler object at the server is updated with the following message, and the scheduling attribute is set to false:
              Unable to change the sched_priv directory
    • If PBS validation checks for the new value of the sched_priv directory do not pass, the comment of the corresponding scheduler object at the server is updated with the following message, and the scheduling attribute is set to false:
              PBS failed validation checks for sched_priv directory
    • If the scheduler successfully accepts the new log directory configured through qmgr, the following message is logged in the scheduler logs:
              Scheduler log directory is changed to <path of the log directory>
    • If the scheduler successfully accepts the new sched_priv directory configured through qmgr, the following message is logged in the scheduler logs:
              Scheduler priv directory is changed to <path of the sched_priv directory>
    • If an admin unsets the partition of a scheduler, that scheduler would be identical to the default scheduler; in this case the scheduler shuts itself down and the following message is logged in the scheduler logs:
              Scheduler does not contain a partition.
    • If the scheduler fails to get its stats from the server, the following error message is shown in the scheduler logs:
              Unable to retrieve the scheduler attributes from server
    • A new option -I is introduced to provide a name to a scheduler. If pbs_sched is run without this option, it is considered the default scheduler, whose name is "default".
              Example: pbs_sched -I sc1 -S 15051
              Here the scheduler is started on port number 15051 with id/name "sc1".


Interface 3: Removed


Interface 4: Changes to PBS server.

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • scheduling and scheduler_iteration now belong to the sched object. For backward compatibility PBS still allows these attributes to be set on the PBS server object; any changes made to them on the server are automatically reflected in the default scheduler, and changes made on the default scheduler are reflected on the server object.
    • During failover, when the secondary server takes control it tries to connect to the schedulers using their sched_host attribute.
      • If the secondary server is unable to connect to a scheduler running on a remote host, it starts that scheduler locally and updates the scheduler's "sched_host" attribute.
      • When the primary PBS server takes back control from the secondary, it checks whether each scheduler's sched_host attribute matches its server name; if it does not, the primary shuts down the remote scheduler and spawns it locally.
    • If at any point the server is unable to contact the corresponding scheduler, one of the following messages is shown in server_logs:
             Unable to reach scheduler associated with partition [<partition id>]
             Unable to reach scheduler associated with job <job id>
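The failover hand-off above can be summarized in a short sketch. The function name and dictionary layout are assumptions for illustration; a real server spawns a scheduler process rather than just updating a field.

```python
# Sketch of sched_host reconciliation on failover/failback; illustrative only.
def reconcile_sched_host(active_server_host, sched):
    """If a scheduler's sched_host is not the active server, restart it locally."""
    if sched["sched_host"] != active_server_host:
        # shut down the remote scheduler, spawn it locally, update the attribute
        sched["sched_host"] = active_server_host
        return "restarted locally"
    return "unchanged"
```

The same check runs in both directions: the secondary applies it on takeover, and the primary applies it when it takes control back.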


Interface 5: Changes to PBS Nodes objects.

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • The node object in PBS will have an additional attribute called "partition" which can be used to associate a node with a particular partition.
      • This attribute will be of type string; it is settable only by a manager/operator and viewable by all users.
    • If the "partition" attribute is not set, the node will not belong to any partition and the default scheduler will schedule jobs on this node.
    • A PBS admin/manager can set a node's partition attribute to an existing partition name; that partition's scheduler will then schedule jobs on this node.
    • If a node is associated with a partition, it cannot be linked to a queue which is not part of that partition. Trying to set such a queue on the node results in the following error:
      • Qmgr: s n node1 queue=workq1
        qmgr obj=node1 svr=default: Partition p1 is not part of queue for node
        qmgr: Error (15220) returned from server
    • If a node is associated with a queue, trying to set a partition on the node which does not belong to the same queue results in the following error:
      • Qmgr: s n node1 partition=p2
        qmgr obj=node1 svr=default: Queue q1 is not part of partition for node
        qmgr: Error (15219) returned from server
    • If a queue is associated with one or multiple nodes, trying to change the partition of this queue to a value other than the one set on these nodes results in the following error:
      • Qmgr: s q q1 partition=p2
        qmgr obj=q1 svr=default: Invalid partition in queue
        qmgr: Error (15221) returned from server
    • Nodes with a partition ID (but no queue statement) can run jobs from any of the queues assigned to the same partition (depending upon resource constraints).
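The node/queue/partition consistency rules above can be modeled as two small checks. All names here are hypothetical, not PBS source; only the error codes (15219, 15220) and messages come from the text above.

```python
# Illustrative model of node partition/queue consistency checks; not PBS source.
def set_node_queue(node, queue, queue_partition):
    """Error 15220: a node's queue must belong to the node's partition."""
    if node.get("partition") and queue_partition.get(queue) != node["partition"]:
        raise ValueError("(15220) Partition %s is not part of queue for node"
                         % node["partition"])
    node["queue"] = queue

def set_node_partition(node, partition, queue_partition):
    """Error 15219: a node's partition must match its queue's partition."""
    q = node.get("queue")
    if q and queue_partition.get(q) != partition:
        raise ValueError("(15219) Queue %s is not part of partition for node" % q)
    node["partition"] = partition
```

Either order of assignment is validated: the queue is checked against an existing partition, and the partition against an existing queue.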


Interface 6: Changes to Queues.

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • Queue will have a new queue attribute named "partition" which can be used to associate a queue to a particular partition.
      • This attribute will be of type string and it will be settable only by admin/manager and viewable by all users.
      • If the "partition" attribute is not set, the queue will not belong to any partition and the default scheduler will schedule jobs from this queue.
      • Setting the "partition" attribute on routing queues is not allowed. Trying to do so throws the following error:
      • Qmgr: s q q4 partition=p1
        qmgr obj=q4 svr=default: Cannot assign a partition to route queue
        qmgr: Error (15217) returned from server

      • An execution queue cannot be changed to a routing queue if the "partition" attribute is set on it. Trying to do so throws the following error:
        Qmgr: s q q1 queue_type=route
        qmgr obj=q1 svr=default: Route queues are incompatible with the partition attribute
        qmgr: Error (15218) returned from server



Interface 7: How PBS server runs scheduler.

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • Upon startup the PBS server will start all schedulers which have their scheduling attribute set to "True".
      • The "PBS_START_SCHED" pbs.conf variable is now deprecated and its value is overridden by each scheduler's "scheduling" attribute.
    • The PBS server will connect to these schedulers on their respective host names and port numbers.
    • Scheduling cycles for all configured schedulers are started by the PBS server when a job is queued or finished, when the scheduling attribute is set to True, or when scheduler_iteration elapses.
      • When a job gets queued or finishes, the server checks the job's queue and tries to connect to the corresponding scheduler to run a scheduling cycle.
      • If a scheduler is already running a scheduling cycle, the server waits for the previous cycle to finish before trying to start another one.
      • If job_accumulation_time is set, the server waits until that much time has passed after the submission of a job before starting a new cycle.
    • Each scheduler queries the whole universe of schedulers, server, queues and nodes from the server (this avoids IFL changes) and then does the following:
      • It filters the running, queued and exiting jobs down to the queues associated with its partition(s).
      • It filters the list of nodes down to those associated with the partition(s) managed by the scheduler.
      • It filters the list of global policies, such as run soft/hard limits, set on the server object.
    • PBS's init script will now report the status of the PBS server only. Schedulers are managed by the server and their status can be fetched using a qmgr command.
      • When the pbs_server daemon is stopped using "qterm -s", it also stops all running scheduler processes.
      • The pbs init script uses the "-s" option to qterm while shutting down pbs_server so that all schedulers come down along with the server.
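The filtering steps above can be sketched as one function: the scheduler receives the whole universe and keeps only what belongs to its partition(s). The dictionary layout is an assumption for illustration, not the real IFL data structures.

```python
# Sketch of per-scheduler filtering of the universe; illustrative layout only.
def filter_universe(universe, my_partitions):
    """Keep only jobs and nodes that belong to this scheduler's partition(s)."""
    my_queues = {q for q, part in universe["queue_partition"].items()
                 if part in my_partitions}
    return {
        "jobs": [j for j in universe["jobs"] if j["queue"] in my_queues],
        "nodes": [n for n in universe["nodes"] if n["partition"] in my_partitions],
    }
```

A scheduler owning partition p1 would thus see only the jobs in p1's queues and the nodes tagged with p1.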


Interface 8: Changes to Reservations

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • In a multi-sched environment, reservations can be confirmed by any scheduler servicing its respective partition.
    • Once a reservation is confirmed, it is assigned the partition its node solution came from: a partition attribute is set on the reservation to identify where it was confirmed, and the reservation queue gets a matching partition attribute.
      • Example:

        % pbs_rsub -lselect=1:ncpus=2 -R 1030 -D1200 -I 5
        R865.centos CONFIRMED

        % pbs_rstat -f R865
        Resv ID: R865.centos
        Reserve_Name = NULL
        Reserve_Owner = root@centos
        reserve_type = 2
        reserve_state = RESV_CONFIRMED
        reserve_substate = 2
        reserve_start = Mon Feb 03 10:30:00 2020
        reserve_end = Mon Feb 03 10:50:00 2020
        reserve_duration = 1200
        queue = R865
        Resource_List.ncpus = 2
        Resource_List.nodect = 1
        Resource_List.select = 1:ncpus=2
        Resource_List.place = free
        Resource_List.walltime = 00:20:00
        schedselect = 1:ncpus=2
        resv_nodes = (vnode2:ncpus=2)
        Authorized_Users = root@centos
        server = centos
        ctime = Mon Feb 03 10:08:55 2020
        mtime = Mon Feb 03 10:09:03 2020
        interactive = 5
        Variable_List = PBS_O_LOGNAME=root,PBS_O_HOST=centos,PBS_O_MAIL=/var/spool/mail/arung,PBS_TZID=America/Los_Angeles
        euser = root
        egroup = root
        partition = P1

        % qmgr -c "l q R865"
        Queue R865
        queue_type = Execution
        total_jobs = 0
        state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:0 Exiting:0 Begun:0
        acl_user_enable = True
        acl_users = root@centos
        resources_max.ncpus = 2
        resources_available.ncpus = 2
        enabled = True
        started = False
        partition = P1

    • Once a reservation is confirmed and a partition is assigned to it, it cannot be re-confirmed or altered in any other partition.

    • PBS assigns the default partition name "pbs-default" to all reservations (and their queues) confirmed by the default scheduler. If an admin tries to assign the partition name "pbs-default" to a scheduler/queue/node, qmgr throws the error "Default partition name is not allowed".
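The "pbs-default" naming rule can be captured in two small helpers. This is an illustrative sketch with assumed names; only the reserved name and error message come from the text above.

```python
# Sketch of the reserved "pbs-default" partition name rule; not PBS source.
RESERVED_PARTITION = "pbs-default"

def confirmed_partition(sched_name, partition):
    """Reservations confirmed by the default scheduler get the reserved name."""
    return RESERVED_PARTITION if sched_name == "default" else partition

def check_admin_partition_name(value):
    """Admins may not assign the reserved name to a scheduler/queue/node."""
    if value == RESERVED_PARTITION:
        raise ValueError("Default partition name is not allowed")
    return value
```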

Interface 9: Deleted

Interface 10: Fairshare

  • Visibility: Public
  • Change Control: Stable
  • Unless there is only a single scheduler, fairshare as a scheduling policy across the whole PBS complex is no longer supported.
    • The policy is limited to each individual scheduler.
  • The pbsfs command will now act on a single scheduler's fairshare usage database.
    • The new '-I' option allows the admin to specify which scheduler to act on.
      • If no '-I' option is given, pbsfs will act upon the default scheduler
      • pbsfs will now contact the server to query the location of the sched_priv for the scheduler.
        • Since contacting the server is now required, the server needs to be running to use pbsfs.  This was not true before.
      • If the scheduler's sched_priv is not accessible, the existing error message will be printed to stderr
        • Unable to access fairshare data

      • If no such scheduler exists, the following message will be printed to stderr
        • Scheduler <sched> does not exist
      • If a scheduler does not have its sched_priv set, the following message will be printed to stderr
        • Scheduler <sched> does not have its sched_priv set
      • Example:
        • pbsfs -s user1 10
          • sets user1's usage to 10 for the default scheduler
        • pbsfs -I sched2 -s user2 10
          • sets user2's usage to 10 for sched2.
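The pbsfs examples above imply one usage database per scheduler. A minimal Python model of that behavior (the data layout is hypothetical; real usage data lives under each scheduler's sched_priv):

```python
# Minimal model of per-scheduler fairshare usage, mirroring the pbsfs examples.
usage = {}  # scheduler name -> {entity: usage value}

def pbsfs_set(entity, value, sched="default"):
    """Model of 'pbsfs [-I sched] -s entity value'."""
    usage.setdefault(sched, {})[entity] = value

pbsfs_set("user1", 10)                  # pbsfs -s user1 10
pbsfs_set("user2", 10, sched="sched2")  # pbsfs -I sched2 -s user2 10
```

Each scheduler's table is independent: setting user2's usage for sched2 leaves the default scheduler's database untouched.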

  • Notes:

  1. What is not supported when multiple scheduler objects are present:
    • Run limits set on the server are not supported, because a scheduler object does not have a view of the whole PBS universe.

  2. The server's backfill_depth is used as the default value for all the schedulers in the complex.

      Ex: With the default backfill_depth of 1, one job per scheduler is backfilled.

          If the server's backfill_depth is set to 5, 5 jobs from each scheduler get backfilled.
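The arithmetic in this note is simply per-scheduler depth times the number of schedulers; a one-line illustrative sketch:

```python
# Every scheduler backfills backfill_depth jobs, so complex-wide:
def total_backfilled(backfill_depth, num_schedulers):
    return backfill_depth * num_schedulers

assert total_backfilled(1, 3) == 3   # default depth: 1 job per scheduler
assert total_backfilled(5, 3) == 15  # depth 5: 5 jobs from each scheduler
```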

3. The pbs_statsched() IFL call returns the status of all PBS schedulers. The return type is a pointer to a list of batch_status structures (one for each scheduler).

4. PTL framework changes:

    Multiple scheduler information can be accessed through self.server.schedulers.

    All scheduler functions can be called on a specific scheduler as self.server.schedulers['sched_name'].<method name>.

    There is a shorthand for server.schedulers called scheds; it can be used as self.scheds['sched_name'].<method name>.