...

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • PBS does not allow attributes like scheduling, scheduler_iteration to be set on PBS server object.
    • scheduling and scheduler_iteration now belong to the sched object
      • During failover, when the secondary server takes control, it will try to connect to the schedulers by using their host attribute.
        • If the secondary server is unable to connect to a scheduler running on a remote host, it will start that scheduler locally and update its "host" attribute.
        • When the primary PBS server takes control back from the secondary, it will always check whether the scheduler's host attribute matches its own server name. If it does not, it will shut down the remote scheduler and spawn it locally on the primary server.
      • If set at the server level, the changes will be applied to the default sched object
    • For backward compatibility, PBS still allows attributes like scheduling and scheduler_iteration to be set on the PBS server object. Any changes made to these attributes are automatically reflected in the scheduler object. Similarly, any changes made to these attributes in the scheduler object are automatically reflected in the server object.
    • If at any point the server is unable to reach the corresponding scheduler, one of the following messages is shown in server_logs:
             Unable to reach scheduler associated with partition
             Unable to reach scheduler associated with job <job id>
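
The failover behaviour described in the details above can be sketched roughly as follows. This is a minimal illustration, not PBS source code: the `Sched` class, the function names, the action strings, and the `can_connect` callback are all hypothetical, and updating `host` after the primary respawns the scheduler is an assumption (the text only states that the secondary updates it).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sched:
    """Hypothetical stand-in for a sched object; only the fields used here."""
    name: str
    host: str


def on_secondary_takeover(sched: Sched, secondary_host: str,
                          can_connect: Callable[[str], bool]) -> List[str]:
    """Secondary first tries the scheduler at its recorded host."""
    if can_connect(sched.host):
        return ["connect"]
    # Unreachable remote scheduler: start it locally and record the new host.
    sched.host = secondary_host
    return ["start_local"]


def on_primary_takeover(sched: Sched, primary_host: str) -> List[str]:
    """Primary always compares the scheduler's host with its own server name."""
    if sched.host == primary_host:
        return []
    # Mismatch: shut down the remote scheduler and spawn it on the primary.
    sched.host = primary_host  # assumption: host is updated here as well
    return ["shutdown_remote", "start_local"]


if __name__ == "__main__":
    s = Sched(name="sched1", host="remote-host")
    print(on_secondary_takeover(s, "secondary", lambda host: False), s.host)
    print(on_primary_takeover(s, "primary"), s.host)
```

The sketch returns a list of actions instead of performing them, which keeps the decision logic separate from process management and makes it easy to check in isolation.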


Interface 5: Changes to PBS Nodes objects.

...

  • Visibility: Public
  • Change Control: Stable
  • Details:
    • Upon startup PBS server will start all schedulers which have their scheduling attribute set to "True"
      • The "PBS_START_SCHED" pbs.conf variable is now deprecated and its value will be overridden by the scheduler's "scheduling" attribute.
    • PBS server will connect to these schedulers on their respective host names and port number.
    • Scheduling cycles for all configured schedulers are started by the PBS server when a job is queued, when a job finishes, when the scheduling attribute is set to True, or when scheduler_iteration has elapsed.
      • When a job is queued or finishes, the server will check the job's queue and try to connect to the scheduler that corresponds to that queue to run a scheduling cycle.
      • If a scheduler is already running a scheduling cycle, the server will wait for that cycle to finish before trying to start another one.
      • If job_accumulation_time is set then server will wait until that time has passed after the submission of a job before starting a new cycle.
    • Each scheduler, while querying the server, specifies its scheduler name and then gets only the chunk of the universe that is relevant to this scheduler.
      • It gets all the running, queued, and exiting jobs from the queues that are associated with one of its partitions.
      • It gets the list of all nodes that are associated with the partition managed by the scheduler.
      • It gets the list of all the global policies like run soft/hard limits set on the server object.
    • PBS's init script will now report the status of the PBS server only. Schedulers will be managed by the server and their status can be fetched using a qmgr command.
      • When the pbs_server daemon is stopped using "qterm -s", it will also stop all the running scheduler processes.
      • While shutting down pbs_server, the PBS init script will use the "-s" option to qterm so that all schedulers come down along with the server.
    • If at any point the server is unable to reach the corresponding scheduler, one of the following messages is shown in server_logs:
             Unable to reach scheduler associated with partition
             Unable to reach scheduler associated with job <job id>
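
The partition-scoped query described in the details above (each scheduler sees only its own chunk of the universe) can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the function name `scheduler_view`, the dictionary layout, and the state strings are hypothetical and do not reflect the actual PBS data structures or wire protocol.

```python
from typing import Dict, List, Set

# Job states a scheduler receives, per the text: running, queued, exiting.
VISIBLE_STATES = {"running", "queued", "exiting"}


def scheduler_view(partitions: Set[str],
                   queue_partition: Dict[str, str],
                   jobs: List[dict],
                   nodes: List[dict],
                   server_limits: dict) -> dict:
    """Return the part of the universe relevant to one scheduler.

    partitions:      partitions managed by the querying scheduler
    queue_partition: hypothetical mapping of queue name -> partition
    """
    # Queues associated with one of this scheduler's partitions.
    my_queues = {q for q, p in queue_partition.items() if p in partitions}
    return {
        # Jobs in the relevant queues, restricted to the visible states.
        "jobs": [j for j in jobs
                 if j["queue"] in my_queues and j["state"] in VISIBLE_STATES],
        # Nodes associated with the scheduler's partitions.
        "nodes": [n for n in nodes if n["partition"] in partitions],
        # Global policies (for example run soft/hard limits) are not filtered.
        "limits": server_limits,
    }


if __name__ == "__main__":
    view = scheduler_view(
        partitions={"P1"},
        queue_partition={"workq": "P1", "gpuq": "P2"},
        jobs=[{"id": "1", "queue": "workq", "state": "queued"},
              {"id": "2", "queue": "gpuq", "state": "running"}],
        nodes=[{"name": "n1", "partition": "P1"},
               {"name": "n2", "partition": "P2"}],
        server_limits={"max_run": 10},
    )
    print(view)
```

Global limits are passed through unfiltered because the text says every scheduler receives the policies set on the server object, while jobs and nodes are narrowed by partition.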


Interface 8: Changes to pbs_rsub command

...