...

  • Change Control: Public/Stable
  • Permissions: The job_sort_formula must be set by the admin (e.g., root).
  • Summary: A number between 0 and 1 representing the usage of the job's entity, modified to take its location in the fairshare tree into account.
  • Details:
    • A number between 0 and 1.  Higher numbers are less deserving.
    • This number is somewhat arbitrary.  It's based on the actual usage of the entity and its location in the fairshare tree, but isn't directly comparable with any other entity's fairshare_tree_usage.
    • The below formula refers to this keyword as effective usage.
    • NOTE: the formula is calculated once at the start of the cycle.  This factor will not be updated during the cycle (e.g., after jobs run).

Interface 2: job_sort_formula keyword "fairshare_factor"

  • Change Control: Public/Stable
  • Permissions: The job_sort_formula must be set by the admin (e.g., root)
  • Summary: Number that allows two entities to be directly compared
  • Details:
    • A number between 0 and 1.  Lower numbers are less deserving
    • This number allows two entities anywhere in the tree hierarchy to be compared.
    • Low usage entities from high usage groups are negatively affected by their siblings.
    • If any job's entity has 0 shares, this keyword will resolve to 0.
    • See below for calculations 

Interface 3: renaming job_sort_formula keyword: fair_share_perc → fairshare_perc

  • Change Control: Public/Stable
  • Summary: The job_sort_formula keyword fair_share_perc is deprecated and replaced with fairshare_perc
  • Details:
    • The renaming is being done to be better aligned with the new fairshare keyword and other fairshare keywords in the sched_config file

Interface

...

4: Changes to pbsfs output

  • Change Control: Public/Stable
  • Summary: Print the effective usage in pbsfs output
  • Details:
    • When using pbsfs -g, the effective usage is printed

Interface 4: pbsfs and sending the scheduler a HUP

  • Change Control: Public/Stable
  • Summary: HUPing the scheduler will cause the scheduler to reread the usage file
  • Details:
    • Current Behavior:
      • On a HUP, the scheduler will first write out its view of the fairshare usage before rereading it.  This used to be needed when the scheduler didn't write out its fairshare usage every cycle
      • If an admin makes a change via the pbsfs command, the change will be overwritten on the next scheduling cycle
      • The only way to make a change via pbsfs is to first kill the scheduler, make the change, and then restart the scheduler
    • Changed Behavior:
      • On a HUP, the scheduler will reread the usage (it won't write it out first).  If no one has modified the usage since the last scheduling cycle, the scheduler will reread the same data
      • If an admin makes a change via pbsfs, they can send the scheduler a HUP and the scheduler will see the changed usage
    • Possible race condition:
      • The admin could make a change and a scheduling cycle could overwrite it before they have a chance to HUP the scheduler
      • How to fix this race condition:
        • Stop scheduling (i.e., qmgr -c 'set server scheduling=False')
        • Wait for the current scheduling cycle to finish
        • Make changes to the fairshare usage via the pbsfs command
        • HUP the scheduler
        • Start scheduling

How fairshare works:

Fairshare is a tree made up of fairshare groups and entities (e.g., users).  A job belongs to one entity.  Each fairshare group has children, each of which is either another fairshare group or an entity.  Entities don't all have to be at the same level.  For example, root can have 3 children: two groups and one entity, with further entities inside the two groups.  Each group or entity is assigned a number of shares, which gives it a relative percentage among its siblings (its shares divided by the sum of the shares of all the children of its parent).  Each group or entity also has a fairshare target (fairshare_perc), a number between 0 and 1: its relative percentage among its siblings times its parent's target.  If you add up the targets of all of the entities, you will reach 100%.
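The share-to-target calculation described above can be sketched as follows. This is a minimal illustration, not the scheduler's implementation: the Node class, the compute_targets function, the entity names, and the share values are all hypothetical, and root's own target is assumed to be 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A fairshare group or entity (hypothetical structure for illustration)."""
    name: str
    shares: int
    children: List["Node"] = field(default_factory=list)


def compute_targets(root: Node) -> Dict[str, float]:
    """Return each node's fairshare target as a fraction between 0 and 1."""
    # Assumption: root's target is 1 (100% of the complex).
    targets: Dict[str, float] = {root.name: 1.0}

    def walk(parent: Node) -> None:
        total = sum(child.shares for child in parent.children)
        for child in parent.children:
            # Relative percentage between siblings times the parent's target.
            relative = child.shares / total if total else 0.0
            targets[child.name] = relative * targets[parent.name]
            walk(child)

    walk(root)
    return targets


# Hypothetical tree: root has two groups and one entity as children.
tree = Node("root", 100, [
    Node("group1", 40, [Node("alice", 50), Node("bob", 50)]),
    Node("group2", 50, [Node("carol", 60), Node("dave", 40)]),
    Node("erin", 10),
])
targets = compute_targets(tree)

# The targets of the entities (the leaves) add up to 100%.
leaf_total = sum(targets[n] for n in ("alice", "bob", "carol", "dave", "erin"))
print(targets["carol"], leaf_total)
```

In this made-up tree, carol has 60% of group2's shares and group2 has 50% of root's, so carol's target is 0.6 * 0.5 = 0.3.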

...

This is fixed by creating an effective usage number which is based on the actual usage and some of the parent's effective usage.  Even if an entity has zero usage, it gets some of its parent's usage, so it is still negatively affected by its siblings.

The effective usage formula:

entity's actual usage + (parent's effective usage - entity's actual usage) * entity's relative percentage between siblings

...

entity's actual usage: the entity's usage as a fraction of the complex's usage: entity's actual usage number / root's actual usage number.

parent's effective usage: the above formula applied to the parent

entity's relative percentage between siblings: entity's shares / sum of shares of all the children of the parent (i.e., its siblings)


Since the effective usage of the parent is used, this is recursively applied up the tree.  The entity is negatively affected by its siblings.  The parent is negatively affected by its siblings, and so forth up the tree.  The effective usage calculations start at the level below root's children.  Root's children use their actual usage.
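The recursion described above can be sketched as a top-down walk of the tree. This is only an illustration under stated assumptions: the Node class, the effective_usage function, the entity names, and the share and usage values are hypothetical, and each node's actual usage is assumed to be given already as a fraction of root's usage.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A fairshare group or entity (hypothetical structure for illustration)."""
    name: str
    shares: int
    usage: float  # actual usage as a fraction of root's usage
    children: List["Node"] = field(default_factory=list)


def effective_usage(root: Node) -> Dict[str, float]:
    """Return the effective usage of every node below root."""
    result: Dict[str, float] = {}

    def walk(parent: Node, parent_effective: float, depth: int) -> None:
        total = sum(child.shares for child in parent.children)
        for child in parent.children:
            if depth == 1:
                # Root's children use their actual usage.
                eff = child.usage
            else:
                # actual + (parent's effective - actual) * relative percentage
                relative = child.shares / total if total else 0.0
                eff = child.usage + (parent_effective - child.usage) * relative
            result[child.name] = eff
            walk(child, eff, depth + 1)

    walk(root, root.usage, 1)
    return result


# Hypothetical tree: one group with a busy user, a light user, and an idle user.
tree = Node("root", 100, 1.0, [
    Node("group1", 100, 0.1667, [
        Node("bob", 50, 0.083),
        Node("other_user", 25, 0.0837),
        Node("idle_user", 25, 0.0),
    ]),
])
usage_by_name = effective_usage(tree)
print(usage_by_name)
```

Here idle_user has no usage of its own but still ends up with a quarter of group1's usage (about 0.0417), which is how an entity is negatively affected by its siblings.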

Something to note: summing the effective usages of all of the entities will give more than 100%.  This means effective usage on its own doesn't allow for direct comparison between the entities.


The effective usage keyword in the formula is 'fairshare_tree_usage'


Here is a formula to provide a direct comparison between entities.  It is not the only one, but it will work well.  It results in a number between 0 and 1.  A result of .5 means the entity is exactly on its fairshare target (fairshare_perc).

2^-(fairshare_tree_usage / entity's fairshare_perc)

This finally allows for a direct comparison between entities, and therefore the jobs that belong to those entities.
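As a quick check of how the formula behaves, here is a small sketch. The function name and the sample values are hypothetical. Returning 0 for an entity with a fairshare_perc of 0 mirrors the behavior described for the fairshare_factor keyword when an entity has 0 shares; it is not necessarily how the scheduler implements it.

```python
def fairshare_factor(fairshare_tree_usage: float, fairshare_perc: float) -> float:
    """Compute 2^-(fairshare_tree_usage / fairshare_perc).

    Hypothetical helper for illustration; higher results are more deserving.
    """
    if fairshare_perc <= 0:
        # An entity with 0 shares has a target of 0; resolve to 0 instead of
        # dividing by zero.
        return 0.0
    return 2.0 ** -(fairshare_tree_usage / fairshare_perc)


on_target = fairshare_factor(0.24, 0.24)     # usage equals target
unused = fairshare_factor(0.0, 0.24)         # no usage at all
over_target = fairshare_factor(0.48, 0.24)   # twice the target
no_shares = fairshare_factor(0.5, 0.0)       # entity with 0 shares

print(on_target, unused, over_target, no_shares)
```

An entity with no usage gets 1, an entity exactly on target gets .5, and the value halves again each time the usage grows by another multiple of the target.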


This is represented in the job_sort_formula with the shorthand keyword 'fairshare_factor' or by using formula math: pow(2, -(fairshare_tree_usage/fairshare_perc))

There is extra quoting which is required to use formula math in a qmgr command.  Here is how it is done: qmgr -c 'set server job_sort_formula="pow(2, -(fairshare_tree_usage/fairshare_perc))"'


Please note that this formula divides by fairshare_perc.  If an entity's shares are set to 0, this will cause a division by zero error.  Please take care when using this formula.


Example:

Share numbers do not need to add up to 100; it just makes the example easier to understand.  Entities don't all need to be at the same level of the tree.  For example, root could own an entity.

...

relative percentage in group: Bob's shares 50 / total of group1's shares 100: .5

effective usage: Bob's usage .083 + (parent's usage .1667 - Bob's usage .083) * .5: .125

...

relative percentage in group: Suzy's shares 60 / total of group2's shares 100: .6

effective usage: Suzy's usage 0 + (parent's usage .833 - Suzy's usage 0) * .6: .5

...

pbsfs -g example:
# ./pbsfs -g scott
fairshare entity: scott
Resgroup             : 11
cresgroup            : 15
Shares               : 40
Percentage           : 24.000000%
fairshare_tree_usage : 0.832973
usage                : 1000 (cput)
usage/perc           : 4167
Path from root: 
TREEROOT  :     0       1201 / 1.000 = 1201
group2    :    11       1001 / 0.600 = 1668
scott     :    15       1000 / 0.240 = 4167
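
The numbers in this output can be reproduced by hand from the effective usage formula. The sketch below is only a check of the arithmetic: the variable names are made up, the input values are copied from the "Path from root" lines, and scott's relative percentage between siblings is derived as his percentage divided by group2's (0.240 / 0.600).

```python
# Values copied from the "Path from root" lines of the pbsfs -g output.
root_usage = 1201
group2_usage = 1001
scott_usage = 1000
group2_perc = 0.600
scott_perc = 0.240

# Actual usage as a fraction of root's usage.
scott_actual = scott_usage / root_usage
# group2 is a child of root, so its effective usage is its actual usage.
group2_effective = group2_usage / root_usage

# Relative percentage between siblings, derived from the two targets.
scott_relative = scott_perc / group2_perc

# entity's actual usage + (parent's effective usage - actual usage) * relative
tree_usage = scott_actual + (group2_effective - scott_actual) * scott_relative

print(f"{tree_usage:.6f}")                # 0.832973
print(round(scott_usage / scott_perc))    # 4167
print(round(group2_usage / group2_perc))  # 1668
```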



Credit: The math for the effective usage calculation and the fairshare formula example came from the SLURM fairshare documentation.