Interface 1: equiv_class_enable scheduler attribute
- Visibility: Private
- Change Control: Unstable
- Permissions: Write: Manager Read: Everyone
- Details: An administrator can use this attribute to enable the equivalence class capability in the scheduler.
- Usage: boolean
- Default: Unset/Enabled
Interface 2: equiv_class_exclude scheduler attribute
- Visibility: Public
- Change Control: Stable
- Permissions: Write: Manager Read: Everyone
- Details: An administrator can use this attribute to exclude certain resources from consideration when the scheduler builds equivalence classes. Excluded resources are dropped from both the Resource_List resources and the select resources.
- Usage: comma separated list of resources
- Example: equiv_class_exclude: walltime, software
- Default: Unset
Equivalence classes are a way to group identical jobs together. Once one job in a class cannot run, the scheduler knows that the rest of the jobs in that class cannot run either.
How they work:
- The scheduler sorts jobs into priority order. This may include sorting algorithms such as the job_sort_formula or fairshare.
- The scheduler starts considering jobs in sorted order.
- If calendaring is enabled, the first N (backfill_depth=N) jobs that cannot run are added to the calendar.
- After calendaring has finished, equivalence classes come into play. Any time a job cannot run, the rest of the jobs in its class are not considered for the remainder of the cycle.
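The steps above can be sketched in Python as follows. This is a minimal illustration, not the scheduler's implementation: run_cycle, can_run, key_of, and the job dictionaries are hypothetical names, and the sketch assumes a class is marked as blocked only by a failure that occurs once the calendar is full.

```python
def run_cycle(sorted_jobs, can_run, key_of, backfill_depth=0):
    """Walk jobs in priority order, skipping classes known not to run.

    sorted_jobs    -- jobs already in priority order (hypothetical dicts with "id")
    can_run(job)   -- returns None if the job can run, else a reason string
    key_of(job)    -- returns the job's equivalence class key
    backfill_depth -- number of jobs that may be added to the calendar
    """
    blocked = {}     # class key -> reason the class cannot run
    calendared = []  # ids of jobs added to the calendar
    started = []     # ids of jobs that were run
    log = []         # (job id, message) pairs

    for job in sorted_jobs:
        key = key_of(job)
        calendar_full = len(calendared) >= backfill_depth

        # Classes only apply once calendaring has finished (assumption:
        # "finished" means the calendar already holds backfill_depth jobs).
        if calendar_full and key in blocked:
            log.append((job["id"], blocked[key]))
            continue

        log.append((job["id"], "Considering job to run"))
        reason = can_run(job)
        if reason is None:
            started.append(job["id"])
            continue

        log.append((job["id"], reason))
        if calendar_full:
            blocked[key] = reason          # rest of the class is skipped
        else:
            calendared.append(job["id"])   # still filling the calendar

    return log, calendared, started
```

With backfill_depth=0, only the first job of a class that cannot run gets a "Considering job to run" entry; later jobs in that class receive just the recorded reason.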
There is a new qmgr scheduler object attribute named 'equiv_class_enable' which will switch between the old and new behavior of this feature.
Usage: qmgr> set sched equiv_class_enable = True
Once equivalence classes are enabled, the scheduler groups identical jobs into classes. An equivalence class is made up of jobs that have the same euser, egroup, project, select, place, and Resource_List resources. Undesired resources can be excluded by listing them in the 'equiv_class_exclude' sched attribute; each listed resource is excluded from both the Resource_List resources and the select resources.
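One way to picture the grouping is as a key computed from each job, where jobs with equal keys fall into the same class. The sketch below is illustrative only: the job dictionary layout (with "select" pre-parsed into a list of chunk dictionaries) and the names class_key and build_classes are assumptions, not the scheduler's data structures.

```python
from collections import defaultdict

# Job attributes that the text names as part of a class's identity.
IDENTITY_FIELDS = ("euser", "egroup", "project", "place")


def _filtered(resources, excluded):
    """Return resources as a sorted tuple of pairs, minus excluded names."""
    return tuple(sorted((name, value) for name, value in resources.items()
                        if name not in excluded))


def class_key(job, exclude=()):
    """Build a hashable key; jobs with equal keys share an equivalence class."""
    excluded = set(exclude)
    identity = tuple(job.get(field) for field in IDENTITY_FIELDS)
    # Excluded resources are removed from Resource_List ...
    resource_list = _filtered(job.get("Resource_List", {}), excluded)
    # ... and from every chunk of the select specification.
    select = tuple(_filtered(chunk, excluded) for chunk in job.get("select", []))
    return identity + (resource_list, select)


def build_classes(jobs, exclude=()):
    """Map each class key to the ids of the jobs that belong to it."""
    classes = defaultdict(list)
    for job in jobs:
        classes[class_key(job, exclude)].append(job["id"])
    return dict(classes)
```

In this sketch, two jobs that differ only in walltime land in separate classes by default, and in the same class when "walltime" is passed in exclude, which mirrors the effect of listing walltime in equiv_class_exclude.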
The external behavior of this feature is visible in the scheduler logs as follows:
Old:
...;Job Id;Considering job to run
...;Job Id;<Reason job can not run>
...;Job Id;Considering job to run
...
i.e. each job gets its own "Considering job to run" line.
Example:
04/15/2015 16:01:18;1234.mars;Considering job to run
04/15/2015 16:01:18;1234.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1235.mars;Considering job to run
New:
...;Job Id;Considering job to run
...;Job Id;<Reason job can not run>
<same line for rest of equivalence class>
Example:
04/15/2015 16:01:18;1234.mars;Considering job to run
04/15/2015 16:01:18;1234.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1235.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1236.mars;Insufficient amount of resource: ncpus
04/15/2015 16:01:18;1237.mars;Insufficient amount of resource: ncpus
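When reading logs produced with the new behavior, it can help to separate jobs that were actually evaluated from jobs that were rejected through their class. The sketch below assumes the abbreviated three-field form used in the examples above (timestamp;job id;message); real scheduler log lines may carry additional fields, and split_log is a hypothetical helper name.

```python
def split_log(lines):
    """Separate evaluated jobs from jobs rejected via their equivalence class.

    Returns (considered, skipped): job ids that have a "Considering job to run"
    entry, and job ids that only ever appear with a reason message.
    """
    considered, skipped = [], []
    for line in lines:
        # Split into at most three fields so messages may contain ';'.
        _timestamp, job_id, message = line.split(";", 2)
        if message == "Considering job to run":
            considered.append(job_id)
        elif job_id not in considered and job_id not in skipped:
            skipped.append(job_id)
    return considered, skipped
```

Applied to the five example lines above, this reports 1234.mars as considered and 1235.mars, 1236.mars, and 1237.mars as skipped.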