PP-718: Add fairshare usage values to the job_sort_formula

Developer's forum post: http://community.pbspro.org/t/pp-718-add-fairshare-usage-values-to-the-job-sort-formula/471/1

Interface 1: job_sort_formula keyword "fairshare_tree_usage"

  • Change Control: Public/Stable
  • Permissions: The job_sort_formula must be set by the admin (e.g., root).
  • Summary: A number between 0 and 1 representing the usage of the job's entity, modified to take its location in the fairshare tree into account.
  • Details:
    • A number between 0 and 1.  Higher numbers are less deserving.
    • This number is somewhat arbitrary.  It's based on the actual usage of the entity and its location in the fairshare tree, but isn't directly comparable with any other entity's fairshare_tree_usage.
    • The formula below refers to this value as the effective usage.
    • NOTE: the formula is calculated once at the start of the cycle.  This factor will not be updated during the cycle (e.g., after jobs run).

Interface 2: job_sort_formula keyword "fairshare_factor"

  • Change Control: Public/Stable
  • Permissions: The job_sort_formula must be set by the admin (e.g., root).
  • Summary: A number that allows two entities to be compared directly.
  • Details:
    • A number between 0 and 1.  Lower numbers are less deserving.
    • This number allows two entities anywhere in the tree hierarchy to be compared.
    • Low usage entities from high usage groups are negatively affected by their siblings.
    • If a job's entity has 0 shares, this keyword will resolve to 0.
    • See below for the calculations.

Interface 3: renaming job_sort_formula keyword: fair_share_perc → fairshare_perc

  • Change Control: Public/Stable
  • Summary: The job_sort_formula keyword fair_share_perc is deprecated and replaced with fairshare_perc
  • Details:
    • The keyword is renamed to align with the new fairshare keywords and with the other fairshare keywords in the sched_config file.

Interface 4: Changes to pbsfs output

  • Change Control: Public/Stable
  • Summary: Print the effective usage in the pbsfs output.
  • Details:
    • pbsfs -g prints the entity's effective usage, labeled fairshare_tree_usage.

How fairshare works:

Fairshare is a tree made up of fairshare groups and entities (e.g., users).  A job belongs to one entity.  Each child of a fairshare group is either another fairshare group or an entity, and entities do not all have to be at the same level.  For example, root can have three children: two groups, each containing entities, and one entity.  Each group or entity is assigned a number of shares.  Shares are relative: a group's or entity's relative percentage is its shares divided by the sum of the shares of all the children of its parent.  That relative percentage times the parent's target gives the group's or entity's fairshare target (fairshare_perc), which is a number between 0 and 1.  If you add up the targets of all of the entities, you will reach 100%.
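
The conversion from shares to targets can be sketched in Python.  This is only an illustration, not the scheduler's implementation: the Node class and the assign_targets and leaf_targets functions are hypothetical names, and the share values are those of the example tree later in this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical fairshare tree node (a group or an entity)."""
    name: str
    shares: int
    children: List["Node"] = field(default_factory=list)
    fairshare_perc: float = 0.0


def assign_targets(node: Node, target: float = 1.0) -> None:
    """Set node's target, then split it among its children by shares."""
    node.fairshare_perc = target
    total = sum(child.shares for child in node.children)
    for child in node.children:
        # Relative percentage between siblings times the parent's target.
        relative = child.shares / total if total else 0.0
        assign_targets(child, target * relative)


def leaf_targets(node: Node) -> Dict[str, float]:
    """Collect the targets of the entities (leaves) below node."""
    if not node.children:
        return {node.name: node.fairshare_perc}
    result: Dict[str, float] = {}
    for child in node.children:
        result.update(leaf_targets(child))
    return result


# Root's own share count is unused; its target is always 1.0.
root = Node("root", 0, [
    Node("group1", 40, [Node("Bob", 50), Node("Cathy", 50)]),
    Node("group2", 60, [Node("Suzy", 60), Node("Scott", 40)]),
])
assign_targets(root)
targets = leaf_targets(root)
```

With these shares, the entity targets come out as .2, .2, .36 and .24, which add up to 1.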

Usage is accumulated by the entities.  Each time an entity accumulates usage, its parent group accumulates the same amount (and its parent, and so on up the tree).  This means the parent's usage is a sum of all of its children's usage.
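
As a rough sketch of this accumulation (the dictionaries and the accumulate function are hypothetical, chosen only for illustration), each charge can be applied to the entity and then to every ancestor:

```python
from typing import Dict, Optional

# Hypothetical tree: node name -> parent name (None marks the root).
parent_of: Dict[str, Optional[str]] = {
    "root": None,
    "group1": "root", "group2": "root",
    "Bob": "group1", "Cathy": "group1",
    "Suzy": "group2", "Scott": "group2",
}
usage: Dict[str, float] = {name: 0.0 for name in parent_of}


def accumulate(entity: str, amount: float) -> None:
    """Add usage to an entity and to every ancestor up to the root."""
    node: Optional[str] = entity
    while node is not None:
        usage[node] += amount
        node = parent_of[node]


accumulate("Bob", 100)
accumulate("Cathy", 100)
accumulate("Scott", 1000)
```

After these three charges, each group holds the sum of its children's usage and root holds the total.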

The most deserving entity is determined by walking the tree.  At each level in the tree, the most deserving group is chosen, and we descend to its children and continue on until we reach an entity.  The most deserving group/entity is a function of the target and the usage.  As a consequence, low usage entities in high usage groups are negatively affected by their sibling entities/groups.
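
The walk can be sketched as follows.  The text above only says that deservingness is a function of the target and the usage; this sketch assumes the ranking is usage divided by target (lower is more deserving), which matches the usage/perc column that pbsfs prints but is an assumption here.  All names are hypothetical and the numbers are those of the example tree later in this document.

```python
from typing import Dict, List, Tuple

# Hypothetical data: node -> (usage, fairshare_perc).
stats: Dict[str, Tuple[float, float]] = {
    "group1": (200, 0.4), "group2": (1000, 0.6),
    "Bob": (100, 0.2), "Cathy": (100, 0.2),
    "Suzy": (0, 0.36), "Scott": (1000, 0.24),
}
children: Dict[str, List[str]] = {
    "root": ["group1", "group2"],
    "group1": ["Bob", "Cathy"],
    "group2": ["Suzy", "Scott"],
}


def ratio(name: str) -> float:
    """Assumed ranking: usage divided by target; lower is more deserving."""
    used, perc = stats[name]
    return used / perc if perc > 0 else float("inf")


def most_deserving(node: str = "root") -> str:
    """Descend from node, choosing the lowest-ratio child at each level."""
    while node in children:
        node = min(children[node], key=ratio)
    return node


winner = most_deserving()
```

Under this assumption the walk descends into group1, so Suzy is not chosen even though she has no usage of her own: her group loses at the first level because of Scott's usage.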


How do we map fairshare into the formula:

The easiest way to map fairshare to the formula would be to take the actual usage and fairshare_perc of each entity and compare them.  This doesn't work because low usage entities of high usage groups would no longer be negatively affected by their siblings.  Doing so effectively flattens the tree.

This is fixed by creating an effective usage number which is based on the actual usage and some of the parent's effective usage.  Even if an entity has zero usage, it gets some of its parent's usage, so it is still negatively affected by its siblings.

The effective usage formula:

entity's actual usage + (parent's effective usage - entity's actual usage) * entity's relative percentage between siblings

Here are the factors:

entity's actual usage: the entity's fraction of the complex's total usage: entity's usage number / all usage.

parent's effective usage: the above formula applied to the parent

entity's relative percentage between siblings: entity's shares / sum of shares of all the children of the parent (i.e., its siblings)


Since the effective usage of the parent is used, this is recursively applied up the tree.  The entity is negatively affected by its siblings.  The parent is negatively affected by its siblings and so forth up the tree.  The effective usage calculations start at the level below root's children.  Root's children use their actual usage.
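
The recursion can be sketched in Python as below.  This is a minimal sketch under stated assumptions, not the scheduler's code: the FSNode class, its field names, and the effective_usage function are hypothetical, and the numbers come from the example tree later in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FSNode:
    """Hypothetical fairshare node; `usage` is raw accumulated usage."""
    name: str
    shares: int
    usage: float
    parent: Optional["FSNode"] = None
    sibling_shares: int = 0  # sum of shares of all children of the parent


def effective_usage(node: FSNode, total_usage: float) -> float:
    """Return the node's effective usage as a fraction of all usage."""
    actual = node.usage / total_usage if total_usage else 0.0
    # Root and root's children use their actual usage.
    if node.parent is None or node.parent.parent is None:
        return actual
    parent_eff = effective_usage(node.parent, total_usage)
    relative = (node.shares / node.sibling_shares
                if node.sibling_shares else 0.0)
    return actual + (parent_eff - actual) * relative


root = FSNode("root", 0, 1200)  # root's share count is unused
group1 = FSNode("group1", 40, 200, root, 100)
group2 = FSNode("group2", 60, 1000, root, 100)
bob = FSNode("Bob", 50, 100, group1, 100)
suzy = FSNode("Suzy", 60, 0, group2, 100)

bob_eff = effective_usage(bob, root.usage)
suzy_eff = effective_usage(suzy, root.usage)
```

Because each call asks for the parent's effective usage first, a deeper tree is handled the same way: the penalty from a heavily used ancestor is passed down level by level.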

Something to note: the effective usages of all of the entities can sum to more than 100%.  For this reason, effective usage alone does not allow for a direct comparison between the entities.


The effective usage keyword in the formula is 'fairshare_tree_usage'


Here is a formula to provide a direct comparison between entities.  It is not the only one, but it will work well.  It results in a number between 0 and 1.  A result of .5 means the entity is on target.

2^-(fairshare_tree_usage / entity's fairshare_perc)

This finally allows for a direct comparison between entities, and therefore the jobs that belong to those entities.
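
A few sample values show how the formula behaves.  The function name below is hypothetical and the inputs are made-up illustrative numbers; the sketch assumes fairshare_perc is greater than 0.

```python
def fairshare_factor(tree_usage: float, perc: float) -> float:
    """Return 2^-(tree_usage / perc); assumes perc > 0."""
    return 2.0 ** -(tree_usage / perc)


on_target = fairshare_factor(0.2, 0.2)    # usage equals the target
unused = fairshare_factor(0.0, 0.2)       # no usage at all
over_target = fairshare_factor(0.6, 0.2)  # three times the target
```

An entity exactly on target gets .5, an entity with no effective usage gets 1, and the value halves each time the effective usage grows by one more multiple of the target.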


This is represented in the job_sort_formula with the shorthand keyword 'fairshare_factor' or by using formula math: pow(2, -(fairshare_tree_usage/fairshare_perc))

There is extra quoting which is required to use formula math.  Here is how it is done:

qmgr -c 'set server job_sort_formula="pow(2, -(fairshare_tree_usage/fairshare_perc))"'


Please note that this formula divides by fairshare_perc.  If an entity's shares are set to 0, this will cause a division by zero error.  Please take care when using this formula.
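
When reproducing the calculation outside the scheduler (for example, in a reporting script), the zero-share case can be guarded explicitly.  The sketch below is hypothetical: the function name is invented, and returning 0 for zero shares is an assumption chosen to mirror the documented behavior of the fairshare_factor keyword.

```python
import math


def safe_fairshare_factor(tree_usage: float, perc: float) -> float:
    """pow(2, -(tree_usage / perc)), or 0.0 when perc is 0.

    Returning 0.0 for zero shares is an assumption that mirrors the
    documented behavior of the fairshare_factor keyword.
    """
    if perc <= 0.0:
        return 0.0  # least deserving; avoids dividing by zero
    return math.pow(2.0, -(tree_usage / perc))


zero_share_value = safe_fairshare_factor(0.5, 0.0)
normal_value = safe_fairshare_factor(0.5, 0.36)
```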


Example:

Share numbers do not need to add up to 100; it just makes the example easier to understand.  Entities don't need to all be at the same level of the tree.  For example, root could own an entity.

Tree:

  • Root fairshare_perc: 1.0 actual usage: 1200
    • group1 shares: 40 fairshare_perc: .4 actual usage: 200
      • Bob shares: 50 fairshare_perc: .2 actual usage: 100
      • Cathy shares: 50 fairshare_perc: .2 actual usage: 100
    • group2 shares: 60 fairshare_perc: .6 actual usage: 1000
      • Suzy shares: 60 fairshare_perc: .36 actual usage: 0
      • Scott shares: 40 fairshare_perc: .24 actual usage: 1000


Bob:

actual usage: 100/1200: .083

parent's usage: 200/1200: .1667

relative percentage in group: Bob's shares 50 / total of group1's shares 100: .5

effective usage: Bob's usage .083 + (parent's usage .1667 - Bob's usage .083) * .5: .125

Fairshare formula: 2^-(.125/.2): .648


Suzy:

actual usage: 0/1200: 0

parent's usage: 1000/1200: .833

relative percentage in group: Suzy's shares 60 / total of group2's shares 100: .6

effective usage: Suzy's usage 0 + (parent's usage: .833 - Suzy's usage: 0) * .6: .5

Fairshare formula: 2^-(.5/.36): .382


Even though Suzy has a higher fairshare_perc than Bob and less usage than Bob, her fairshare formula value is quite a bit lower than his.  This is due to the large amount of usage accumulated by her group mate, Scott.
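
The same arithmetic can be applied to all four entities of the example tree to see the resulting order.  This is an illustrative sketch with hypothetical variable names; it relies on the fact that every parent in this tree is a child of root and therefore uses its actual usage.

```python
# (name, shares, usage, parent usage, sum of sibling shares, fairshare_perc)
ENTITIES = [
    ("Bob",   50,  100,  200, 100, 0.20),
    ("Cathy", 50,  100,  200, 100, 0.20),
    ("Suzy",  60,    0, 1000, 100, 0.36),
    ("Scott", 40, 1000, 1000, 100, 0.24),
]
TOTAL_USAGE = 1200

factors = {}
for name, shares, used, parent_used, sibling_shares, perc in ENTITIES:
    actual = used / TOTAL_USAGE
    # The parents are root's children, so they use their actual usage.
    parent_eff = parent_used / TOTAL_USAGE
    tree_usage = actual + (parent_eff - actual) * shares / sibling_shares
    factors[name] = 2.0 ** -(tree_usage / perc)

# Most deserving first (a higher factor is more deserving).
ranking = sorted(factors, key=factors.get, reverse=True)
```

Bob and Cathy tie at the top, Suzy follows, and Scott, who accumulated all of group2's usage, comes last.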


pbsfs -g example:

# ./pbsfs -g scott
fairshare entity: scott
Resgroup				: 11
cresgroup				: 15
Shares					: 40
Percentage				: 24.000000%
fairshare_tree_usage	: 0.832973
usage					: 1000 (cput)
usage/perc				: 4167
Path from root: 
TREEROOT  :     0       1201 / 1.000 = 1201
group2    :    11       1001 / 0.600 = 1668
scott     :    15       1000 / 0.240 = 4167
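
As a cross-check, the fairshare_tree_usage value in this output can be reproduced from the other numbers it prints.  The sketch below uses hypothetical variable names and derives the relative percentage between siblings as the entity's target divided by its parent's target, which follows from how targets are defined above.

```python
# Values read from the `pbsfs -g scott` output above.
total_usage = 1201      # TREEROOT usage
group2_usage = 1001     # usage of scott's parent, group2
scott_usage = 1000
group2_perc = 0.600
scott_perc = 0.240

# Relative percentage between siblings = entity target / parent target.
relative = scott_perc / group2_perc

actual = scott_usage / total_usage
parent_eff = group2_usage / total_usage  # group2 is a child of root
tree_usage = actual + (parent_eff - actual) * relative

# The usage/perc column, rounded to a whole number as pbsfs prints it.
usage_over_perc = round(scott_usage / scott_perc)
```

The computed tree_usage agrees with the printed 0.832973 to six decimal places, and usage_over_perc matches the printed 4167.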



Credit: The math for the effective usage calculation and the fairshare formula example came from the SLURM fairshare documentation.