Equivalence classes group similar jobs together.  Once one job in a class cannot run, the scheduler knows that the remaining jobs in that class cannot run either.  This makes the scheduler more efficient because it does not have to evaluate every job in the system individually.


Two jobs are similar if they match on the following attributes and resources:

euser: If there are any user limits (soft or hard)

egroup: If there are any group limits (soft or hard)

project: If there are any project limits (soft or hard)

queue: If the job is in a queue

  • with limits (hard or soft)
  • with nodes associated to it
  • which is a prime time queue
  • which is a nonprime time queue
  • which is a dedicated time queue

All resources in the select statement that are listed on the sched_config resources line

All resources in Resource_List (qsub -l) that are listed on the sched_config resources line

The place statement
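The criteria above can be pictured as a key computed per job: jobs with equal keys fall into the same class. The following is a minimal sketch only, not the scheduler's implementation. The job dictionaries, the `limits` flags, the `special_queues` set (queues with limits, associated nodes, or prime/nonprime/dedicated time) and all function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def filter_resources(resources, sched_resources):
    """Keep only resources named on the sched_config resources line (hashable form)."""
    return tuple(sorted(
        (name, value) for name, value in resources.items()
        if name in sched_resources
    ))


def class_key(job, sched_resources, limits, special_queues):
    """Build a hashable key; jobs with equal keys share an equivalence class.

    All inputs are illustrative assumptions:
      limits         -- dict of flags, e.g. {"user": True} if any user limits exist
      special_queues -- names of queues that matter for classing
    """
    return (
        job["euser"] if limits.get("user") else None,
        job["egroup"] if limits.get("group") else None,
        job["project"] if limits.get("project") else None,
        job["queue"] if job["queue"] in special_queues else None,
        # select statement: one (count, resources) pair per chunk
        tuple((count, filter_resources(chunk, sched_resources))
              for count, chunk in job["select"]),
        filter_resources(job["resource_list"], sched_resources),
        job["place"],
    )


def group_jobs(jobs, sched_resources, limits, special_queues):
    """Map each class key to the list of job IDs in that class, in input order."""
    classes = defaultdict(list)
    for job in jobs:
        key = class_key(job, sched_resources, limits, special_queues)
        classes[key].append(job["id"])
    return dict(classes)
```

In this sketch, an attribute such as euser only contributes to the key when the corresponding limits exist, so two otherwise identical jobs from different users land in the same class on a system without user limits.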


How equivalence classes work:

  1. The scheduler sorts jobs into priority order.  This may include sorting algorithms such as the job_sort_formula or fairshare.
  2. The scheduler starts considering jobs in sorted order.
  3. If calendaring is enabled, the first N (backfill_depth=N) jobs that cannot run are added to the calendar.
  4. After calendaring has finished, equivalence classes come into play.  Any time a job cannot run, the rest of the jobs in its class are not considered for the remainder of the cycle.
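The steps above can be sketched as a single loop over the sorted jobs. This is a simplified illustration under stated assumptions, not the scheduler's code: `run_cycle`, `can_run` (returns `None` if the job can run, otherwise a reason string) and `class_of` (returns the job's class key) are hypothetical names, and adding a job to the calendar is reduced to appending it to a list.

```python
def run_cycle(sorted_jobs, can_run, class_of, backfill_depth=0):
    """Simulate one scheduling cycle over jobs already in priority order.

    Returns (started, calendar, log), where log is a list of
    (job, message) pairs in the order they would be written.
    """
    started = []
    calendar = []
    log = []
    blocked = {}  # class key -> reason recorded for the first job that failed

    for job in sorted_jobs:
        key = class_of(job)

        # A job whose class is already known not to run is skipped:
        # it only receives the recorded reason, no "Considering" line.
        if key in blocked:
            log.append((job, blocked[key]))
            continue

        log.append((job, "Considering job to run"))
        reason = can_run(job)
        if reason is None:
            started.append(job)
            continue

        log.append((job, reason))
        if len(calendar) < backfill_depth:
            # Calendaring still in progress: the job is added to the
            # calendar and its class is not blocked yet.
            calendar.append(job)
        else:
            # Calendaring finished (or disabled): block the whole class.
            blocked[key] = reason

    return started, calendar, log
```

With `backfill_depth=0` and four jobs of one class that all fail for the same reason, only the first job gets a "Considering job to run" entry and the other three get the reason alone.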


The external behavior of this feature is seen in the following way in the scheduler logs:

Old:

...;Job Id;Considering job to run
...;Job Id;<Reason job can not run>
...;Job Id;Considering job to run
...

i.e. each job gets its own "Considering job to run" line.

Example:

04/15/2015 16:01:18;1234.mars;Considering job to run
04/15/2015 16:01:18;1234.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1235.mars;Considering job to run

New:

...;Job Id;Considering job to run
...;Job Id;<Reason job can not run>
<same line for rest of equivalence class>

Example:

04/15/2015 16:01:18;1234.mars;Considering job to run
04/15/2015 16:01:18;1234.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1235.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1236.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1237.mars;Insufficient amount of resource: ncpus

Note: Prior to this feature, the order in which the 'Considering job to run' lines appeared in the scheduler log was the political ordering of the jobs.  It is no longer possible to determine the political ordering of individual jobs from the log.  The order of the 'Considering job to run' lines is now the political ordering of the equivalence classes.
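As a rough illustration of reading that ordering back out of a log, the sketch below collects the job IDs that have a 'Considering job to run' line, in the order they appear. It assumes semicolon-separated lines whose last two fields are the job ID and the message, as in the excerpts on this page; the function name `considered_order` is hypothetical.

```python
def considered_order(log_lines):
    """Return job IDs with a 'Considering job to run' line, in log order."""
    order = []
    for line in log_lines:
        fields = line.rstrip("\n").split(";")
        if len(fields) < 3:
            continue  # not a job record in the assumed format
        job_id, message = fields[-2], fields[-1]
        if message == "Considering job to run":
            order.append(job_id)
    return order
```

Under the new behavior, each ID returned is the first job considered from its equivalence class after calendaring, so the list reflects class order rather than the order of every job.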
