...

  • Change Control: Public/Stable
  • Summary: print arbitrary usage in pbsfs output
  • Details:
    • When using pbsfs -g, the arbitrary usage is printed

Interface 4: pbsfs and sending the scheduler a HUP

  • Change Control: Public/Stable
  • Summary: HUPing the scheduler will cause the scheduler to reread the usage file
  • Details:
    • Current Behavior:
      • On a HUP, the scheduler writes out its view of the fairshare usage before rereading it. This was needed back when the scheduler didn't write out its fairshare usage every cycle
      • If an admin makes a change via the pbsfs command, the change will be overwritten on the next scheduling cycle
      • The only way to make a change via pbsfs is to first kill the scheduler, make the change, and then restart the scheduler
    • Changed Behavior:
      • On a HUP, the scheduler will reread the usage (it won't write it out first).  If no one has modified the usage since the last scheduling cycle, the scheduler will reread the same data
      • If an admin makes a change via pbsfs, they can send the scheduler a HUP and the scheduler will see the changed usage
      • Possible race condition:
        • The admin could make a change and a cycle could overwrite the change before they have a chance to HUP the scheduler
        • How to fix this race condition
          • Stop scheduling (i.e., qmgr -c 'set server scheduling=False')
          • Wait for the current scheduling cycle to finish
          • Make changes to the fairshare usage via the pbsfs command
          • HUP the scheduler
          • Start scheduling 

How fairshare works:

Fairshare is organized as a tree made up of fairshare groups and entities (e.g., users). A job belongs to exactly one entity. Each child of a fairshare group is either another fairshare group or an entity, and entities do not all have to sit at the same level. For example, root can have three children: two groups and one entity, with each of the two groups containing entities of its own. Each group or entity is assigned a number of shares, which count relative to the shares of its siblings. The shares are turned into a fairshare target (fairshare_perc), a number between 0 and 1: a group's or entity's target is its relative percentage among its siblings times its parent's target. If you add up the targets of all of the entities, you will reach 100%.
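
The target calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheduler's actual code: the Node class, the function names, and the example tree (group and user names, share values) are all hypothetical. It assumes that root's target is 1.0 and that "relative percentage between siblings" means a node's shares divided by the sum of its siblings' shares (itself included).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical tree node: a fairshare group (has children) or an entity (leaf)."""
    name: str
    shares: int
    children: List["Node"] = field(default_factory=list)


def compute_targets(root: Node) -> Dict[str, float]:
    """Map every node name to its target, assuming the root's target is 1.0.

    Node names are assumed to be unique within the tree.
    """
    targets: Dict[str, float] = {}

    def walk(node: Node, target: float) -> None:
        targets[node.name] = target
        total = sum(child.shares for child in node.children)
        for child in node.children:
            # Child's fraction among its siblings, times the parent's target.
            walk(child, target * child.shares / total)

    walk(root, 1.0)
    return targets


def entity_names(node: Node) -> List[str]:
    """Collect the names of all leaves (entities) below a node."""
    if not node.children:
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(entity_names(child))
    return names


def example_tree() -> Node:
    """Root with three children: two groups and one entity (all names made up)."""
    return Node("root", 1, [
        Node("grpA", 2, [Node("alice", 3), Node("bob", 1)]),
        Node("grpB", 1, [Node("dave", 1)]),
        Node("carol", 1),
    ])


if __name__ == "__main__":
    tree = example_tree()
    targets = compute_targets(tree)
    for name in entity_names(tree):
        print(name, targets[name])
    print("sum of entity targets:", sum(targets[n] for n in entity_names(tree)))
```

In this made-up tree, grpA holds 2 of the 4 shares under root, so its target is 0.5. alice holds 3 of the 4 shares under grpA, so her target is 0.75 times 0.5. The entity targets add up to 1, matching the statement that all entity targets together reach 100%.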

...