Use case:
As clusters get larger and workloads become more varied, it is critical that jobs are evaluated in as short a time as possible to ensure that the correct workload is running. Using multiple schedulers to address this can allow for different scheduling policies and a quicker turnaround time for a large number of jobs or nodes.
Gist of design proposal:
The PBS scheduler in its current form can easily run as multiple instances on the same machine. There are only two major problems that we have to deal with, and the design proposal below addresses both of them.
Interface 1: Extend PBS to support a list of scheduler objects
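The policy examples in the table below all act on a named scheduler object (`sched1`). A minimal sketch of creating such an object is shown here; the `create sched` verb is an assumption of this sketch (only `s sched` appears in the examples below), and `sched1` is an illustrative name:

```shell
# Hypothetical sketch: create a second scheduler object named sched1.
# The "create sched" operation is assumed by this proposal, not current qmgr syntax.
qmgr -c "create sched sched1"
```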
Below is the list of policies that reside in the policy attribute of a scheduler object.
Policy name | Type | Default value | Example
---|---|---|---
round_robin | Boolean | round_robin=False | qmgr -c "s sched sched1 policy.round_robin=True" |
by_queue | Boolean | by_queue=True | qmgr -c "s sched sched1 policy.by_queue=True" |
strict_ordering | Boolean | strict_ordering=False | qmgr -c "s sched sched1 policy.strict_ordering=True" |
help_starving_jobs | Boolean | help_starving_jobs=True | qmgr -c "s sched sched1 policy.help_starving_jobs=True" |
max_starve | string | max_starve="24:00:00" | qmgr -c "s sched sched1 policy.max_starve=24:00:00" |
node_sort_formula | string | node_sort_formula="sort_priority" | qmgr -c 's sched sched1 policy.node_sort_formula="resources_available.ncpus - resources_assigned.ncpus"' |
provision_policy | string | provision_policy="aggressive_provision" | qmgr -c 's sched sched1 policy.provision_policy="aggressive_provision"' |
exclude_resources | array_string | NOT SET BY DEFAULT | qmgr -c 's sched sched1 policy.exclude_resources="vmem, color"' |
load_balancing | Boolean | load_balancing=False | qmgr -c "s sched sched1 policy.load_balancing=True" |
fairshare | Boolean | fairshare=False | qmgr -c "s sched sched1 policy.fairshare=True" |
fairshare_usage_res | string | fairshare_usage_res=cput | qmgr -c "s sched sched1 policy.fairshare_usage_res=cput" |
fairshare_entity | string | fairshare_entity=euser | qmgr -c "s sched sched1 policy.fairshare_entity=euser" |
fairshare_decay_time | string | fairshare_decay_time="24:00:00" | qmgr -c "s sched sched1 policy.fairshare_decay_time=24:00:00" |
fairshare_enforce_no_shares | Boolean | fairshare_enforce_no_shares=True | qmgr -c "s sched sched1 policy.fairshare_enforce_no_shares=True" |
preemption | Boolean | preemption=True | qmgr -c "s sched sched1 policy.preemption=True" |
preempt_queue_prio | integer | preempt_queue_prio=150 | qmgr -c "s sched sched1 policy.preempt_queue_prio=190" |
preempt_prio | string | preempt_prio="express_queue, normal_jobs" | qmgr -c 's sched sched1 policy.preempt_prio="starving_jobs, normal_jobs, starving_jobs+fairshare"' |
preempt_order | string | preempt_order="SCR" | qmgr -c 's sched sched1 policy.preempt_order="SCR 70 SC 30"' |
preempt_sort | string | preempt_sort="min_time_since_start" | qmgr -c 's sched sched1 policy.preempt_sort="min_time_since_start"' |
peer_queue | array_string | NOT SET BY DEFAULT | qmgr -c 's sched sched1 policy.peer_queue="workq workq@svr1"' |
server_dyn_res | array_string | NOT SET BY DEFAULT | qmgr -c 's sched sched1 policy.server_dyn_res="mem !/bin/get_mem"' |
dedicated_queues | string | NOT_SET_BY_DEFAULT | qmgr -c 's sched sched1 policy.dedicated_queues="queue1,queue2"' |
log_event | integer | log_event=3328 | qmgr -c "s sched sched1 policy.log_event=255" |
job_sort_formula | string | NOT SET BY DEFAULT | qmgr -c 's sched sched1 policy.job_sort_formula="ncpus*walltime"' |
backfill_depth | integer | backfill_depth=1 | qmgr -c 's sched sched1 policy.backfill_depth=1' |
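Putting several of the attributes above together, a per-scheduler policy could be configured as a short qmgr script. This is a sketch against the proposed interface, not current qmgr syntax; the scheduler name `sched1` and the chosen values are illustrative, not recommendations:

```shell
# Illustrative configuration of one scheduler instance under the proposed
# policy attribute. Double quotes for simple values; single quotes where the
# value itself contains embedded double quotes.
qmgr -c "s sched sched1 policy.by_queue=True"
qmgr -c "s sched sched1 policy.strict_ordering=True"
qmgr -c "s sched sched1 policy.fairshare=True"
qmgr -c "s sched sched1 policy.fairshare_decay_time=24:00:00"
qmgr -c 's sched sched1 policy.job_sort_formula="ncpus*walltime"'
qmgr -c "s sched sched1 policy.backfill_depth=5"
```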