Equivalence classes are a way to group similar jobs together. Once one job in a class cannot run, the scheduler knows that the rest of the jobs in that class cannot run either. This makes the scheduler more efficient, because it does not have to evaluate every job in the system individually.
Similarity is defined by the values of the following attributes and resources. If two jobs have equal values for all of the attributes and resources in use, then they are in the same equivalence class. Each attribute or resource is used only in the situation described next to it.
euser: If there are any user limits (soft or hard)
egroup: If there are any group limits (soft or hard)
project: If there are any project limits (soft or hard)
queue: If the job is in a queue
- with limits (hard or soft)
- with nodes associated with it
- which is a prime time queue
- which is a nonprime time queue
- which is a dedicated time queue
All resources on the sched_config resources line that are requested in the select statement
All resources on the sched_config resources line that are requested in Resource_List (qsub -l)
Time-based resources: walltime, cput, max_walltime, and min_walltime from Resource_List
If preempt_targets_enable is true, Resource_List.preempt_targets
The place statement
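The rules above can be read as building a key per job, where jobs with equal keys share a class. The following is a minimal sketch of that idea only, not the scheduler's actual code. The names SchedState, Job, and equiv_key are hypothetical, the select statement is simplified to a flat resource dictionary, and the five queue conditions are collapsed into a single set of "special" queue names.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet

# Time-based resources named in the list above.
TIME_RESOURCES = ("walltime", "cput", "max_walltime", "min_walltime")


@dataclass(frozen=True)
class SchedState:
    """Hypothetical summary of which conditions currently apply."""
    user_limits: bool = False
    group_limits: bool = False
    project_limits: bool = False
    # Queues with limits, nodes, or prime/non-prime/dedicated time (assumed precomputed).
    special_queues: FrozenSet[str] = frozenset()
    # Resource names from the sched_config resources line.
    sched_resources: FrozenSet[str] = frozenset()
    preempt_targets_enable: bool = False


@dataclass
class Job:
    """Hypothetical, simplified view of a job."""
    euser: str
    egroup: str
    project: str
    queue: str
    select: Dict[str, str] = field(default_factory=dict)  # flattened select (assumption)
    resource_list: Dict[str, str] = field(default_factory=dict)
    place: str = ""


def equiv_key(job: Job, st: SchedState) -> tuple:
    """Build a hashable key; an attribute contributes only when its condition holds."""
    def pick(src: Dict[str, str], names) -> tuple:
        return tuple((n, src.get(n)) for n in sorted(names))

    return (
        job.euser if st.user_limits else None,
        job.egroup if st.group_limits else None,
        job.project if st.project_limits else None,
        job.queue if job.queue in st.special_queues else None,
        pick(job.select, st.sched_resources),
        pick(job.resource_list, st.sched_resources),
        pick(job.resource_list, TIME_RESOURCES),
        job.resource_list.get("preempt_targets") if st.preempt_targets_enable else None,
        job.place,
    )
```

In this sketch, two jobs submitted by different users get the same key as long as no user limits are set, and get different keys once user limits exist.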
How equivalence classes work:
- The scheduler starts considering jobs in sorted order
- When a job can't run, we mark its equivalence class as unable to run and stash the reason the job can't run.
- When we later consider another job from this class, we already know it can't run, so we reuse the stashed reason.
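The steps above amount to caching a "can't run" verdict per class. A minimal sketch follows, assuming a hypothetical class_of function that maps a job to its class key and a hypothetical try_run function that returns a (ran, reason) pair; neither name comes from the scheduler itself.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Result = Tuple[str, bool, Optional[str]]


def run_cycle(
    sorted_jobs: List[str],
    class_of: Callable[[str], Hashable],
    try_run: Callable[[str], Tuple[bool, Optional[str]]],
) -> List[Result]:
    """Consider jobs in sorted order, skipping classes already known not to run."""
    cant_run: Dict[Hashable, Optional[str]] = {}  # class key -> stashed reason
    results: List[Result] = []
    for job in sorted_jobs:
        key = class_of(job)
        if key in cant_run:
            # The class is already marked: reuse the stashed reason, skip the full check.
            results.append((job, False, cant_run[key]))
            continue
        ran, reason = try_run(job)
        if not ran:
            # First failure in this class: mark the class and stash the reason.
            cant_run[key] = reason
        results.append((job, ran, reason))
    return results
```

With this structure, the expensive try_run check happens at most once per class that cannot run, which is where the shorter scheduling cycle would come from.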
There are no public external interface changes for this feature. The only outward sign that the feature is working is a faster scheduling cycle. One PBS private log message is added, for testing purposes only.
Private Interface #1
Change Control: PBS Private/Contractual(QA)
Visibility: Scheduler log message at DEBUG3
Description: "Number of job equivalence classes: N" where N is the number of job equivalence classes