...

  1. The scheduler sorts jobs into priority order.  This may involve sorting mechanisms such as the job_sort_formula or fairshare.
  2. The scheduler starts considering jobs in sorted order.
  3. If calendaring is enabled, the first N (backfill_depth=N) jobs that cannot run are added to the calendar.
  4. After calendaring has finished, equivalence classes come into play.  When a job can't run, the rest of the jobs in its class are not considered during the cycle.
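
The cycle above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the scheduler's actual code: the `Job` type, the `can_run` callback, the `equiv_class` key, and the plain `priority` number standing in for job_sort_formula or fairshare are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Set


@dataclass
class Job:
    job_id: str
    priority: int          # hypothetical stand-in for job_sort_formula/fairshare
    equiv_class: Hashable  # hypothetical key naming the job's equivalence class


def run_cycle(jobs: List[Job], can_run: Callable[[Job], bool],
              backfill_depth: int = 0) -> Dict[str, object]:
    """Simulate one scheduling cycle and report what happened to each job."""
    started: List[str] = []
    calendared: List[str] = []
    not_run: List[str] = []
    blocked: Set[Hashable] = set()  # classes already known not to run
    evaluations = 0                 # how many times can_run() was called

    # Step 1: sort into priority order (highest first in this sketch).
    ordered = sorted(jobs, key=lambda j: j.priority, reverse=True)

    # Step 2: consider jobs in sorted order.
    for job in ordered:
        # Step 4: jobs in a blocked class are not evaluated again this cycle.
        if job.equiv_class in blocked:
            not_run.append(job.job_id)
            continue

        evaluations += 1
        if can_run(job):
            started.append(job.job_id)
        elif len(calendared) < backfill_depth:
            # Step 3: the first N jobs that cannot run go on the calendar.
            calendared.append(job.job_id)
        else:
            # Calendaring is finished, so block the rest of this class.
            blocked.add(job.equiv_class)
            not_run.append(job.job_id)

    return {"started": started, "calendared": calendared,
            "not_run": not_run, "evaluations": evaluations}
```

In this sketch the `blocked` set stays empty until `backfill_depth` jobs have been calendared, which mirrors the ordering in steps 3 and 4: equivalence classes only start pruning work once calendaring is done.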

The externally visible behavior of this feature shows up in the scheduler logs as follows:

Old:

...;Job Id;Considering job to run
...;Job Id;<Reason job can not run>
...;Job Id;Considering job to run
...

That is, each job gets its own "Considering job to run" line.

Example:

04/15/2015 16:01:18;1234.mars;Considering job to run
04/15/2015 16:01:18;1234.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1235.mars;Considering job to run

New:

...;Job Id;Considering job to run
...;Job Id;<Reason job can not run>
<same reason line for each remaining job in the equivalence class>

Example:

04/15/2015 16:01:18;1234.mars;Considering job to run
04/15/2015 16:01:18;1234.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1235.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1236.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1237.mars;Insufficient amount of resource: ncpus

...

  1. We mark the equivalence class as unable to run and stash the reason the job can't run.
  2. Later, when we consider a job from this class, we already know it can't run, so we reuse the stashed reason.
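
The mark-and-stash idea can be illustrated with a small sketch. It assumes a hypothetical `check` callback that returns `None` when a job can run and a reason string otherwise; the function name, the `(job_id, equiv_class)` tuples, and the timestamp-free log lines are simplifications for the example, not the scheduler's real data structures.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple


def consider_jobs(jobs: List[Tuple[str, Hashable]],
                  check: Callable[[str], Optional[str]]) -> List[str]:
    """Return simplified log lines for jobs given in sorted order.

    jobs  -- (job_id, equiv_class) pairs; equiv_class is a hypothetical key
    check -- returns None if the job can run, else the reason it cannot
    """
    stashed: Dict[Hashable, str] = {}  # equiv_class -> stashed reason
    log: List[str] = []

    for job_id, equiv_class in jobs:
        if equiv_class in stashed:
            # The class is already marked: reuse the stashed reason
            # without evaluating the job again.
            log.append(f"{job_id};{stashed[equiv_class]}")
            continue

        log.append(f"{job_id};Considering job to run")
        reason = check(job_id)
        if reason is not None:
            # Mark the class as unable to run and stash the reason.
            stashed[equiv_class] = reason
            log.append(f"{job_id};{reason}")

    return log
```

With four jobs in one class that all lack ncpus, `check` is called only once, and the output has the same shape as the "New" log example: one "Considering job to run" line followed by one reason line per job.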


There are no external interface changes for this feature.  Apart from the scheduler log output, the only outward sign that it is working is a faster scheduling cycle.