Use case:
As clusters grow larger and workloads become more varied, it is becoming critical that jobs are evaluated in as short a time as possible to ensure that the correct workload is being run. Using multiple schedulers to address this issue allows for different scheduling policies and quicker turnaround times for large numbers of jobs or nodes.
Gist of design proposal:
The PBS scheduler in its current form can easily run in multiple instances on the same machine. There are only two major problems that we have to deal with:
The design proposal below aims to address both of these problems.
Interface 1: Extend PBS to support a list of scheduler objects
Below is the list of policies that reside in the policy attribute of the scheduler.
Policy name | Type | Default value | Example |
---|---|---|---|
round_robin | Boolean | round_robin=False | qmgr -c "s policy p1 round_robin=True" |
by_queue | Boolean | by_queue=True | qmgr -c "s policy p1 by_queue=True" |
strict_ordering | Boolean | strict_ordering=False | qmgr -c "s policy p1 strict_ordering=True" |
help_starving_jobs | Boolean | help_starving_jobs=True | qmgr -c "s policy p1 help_starving_jobs=True" |
max_starve | string | max_starve="24:00:00" | qmgr -c "s policy p1 max_starve=24:00:00" |
node_sort_key | array_string | node_sort_key="sort_priority HIGH" | qmgr -c 's policy p1 node_sort_key="sort_priority HIGH, ncpus HIGH"' |
provision_policy | string | provision_policy="aggressive_provision" | qmgr -c 's policy p1 provision_policy="aggressive_provision"' |
exclude_resources | array_string | NOT SET BY DEFAULT | qmgr -c 's policy p1 exclude_resources="vmem, color"' |
load_balancing | Boolean | load_balancing=False | qmgr -c "s policy p1 load_balancing=True" |
fairshare | Boolean | fairshare=False | qmgr -c "s policy p1 fairshare=True" |
fairshare_usage_res | string | fairshare_usage_res=cput | qmgr -c "s policy p1 fairshare_usage_res=cput" |
fairshare_entity | string | fairshare_entity=euser | qmgr -c "s policy p1 fairshare_entity=euser" |
fairshare_decay_time | string | fairshare_decay_time="24:00:00" | qmgr -c "s policy p1 fairshare_decay_time=24:00:00" |
fairshare_enforce_no_shares | Boolean | fairshare_enforce_no_shares=True | qmgr -c "s policy p1 fairshare_enforce_no_shares=True" |
preemption | Boolean | preemption=True | qmgr -c "s policy p1 preemption=True" |
preempt_queue_prio | integer | preempt_queue_prio=150 | qmgr -c "s policy p1 preempt_queue_prio=190" |
preempt_prio | string | preempt_prio="express_queue, normal_jobs" | qmgr -c 's policy p1 preempt_prio="starving_jobs, normal_jobs, starving_jobs+fairshare"' |
preempt_order | string | preempt_order="SCR" | qmgr -c 's policy p1 preempt_order="SCR 70 SC 30"' |
preempt_sort | string | preempt_sort="min_time_since_start" | qmgr -c 's policy p1 preempt_sort="min_time_since_start"' |
peer_queue | array_string | NOT SET BY DEFAULT | qmgr -c 's policy p1 peer_queue="workq workq@svr1"' |
server_dyn_res | array_string | NOT SET BY DEFAULT | qmgr -c 's policy p1 server_dyn_res="mem !/bin/get_mem"' |
dedicated_queues | array_string | NOT SET BY DEFAULT | qmgr -c 's policy p1 dedicated_queues="queue1,queue2"' |
log_event | integer | log_event=3328 | qmgr -c "s policy p1 log_event=255" |
job_sort_formula | string | NOT SET BY DEFAULT | qmgr -c 's policy p1 job_sort_formula="ncpus*walltime"' |
backfill_depth | integer | backfill_depth=1 | qmgr -c 's policy p1 backfill_depth=1' |
job_sort_key | array_string | NOT SET BY DEFAULT | qmgr -c 's policy p1 job_sort_key="ncpus HIGH, mem LOW"' |
prime_spill | string | NOT SET BY DEFAULT | qmgr -c 's policy p1 prime_spill="01:00:00"' |
prime_exempt_anytime_queues | Boolean | prime_exempt_anytime_queues=False | qmgr -c 's policy p1 prime_exempt_anytime_queues=False' |
backfill_prime | Boolean | backfill_prime=False | qmgr -c 's policy p1 backfill_prime=False' |
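
The examples in the table all follow the same shape, `s policy <policy object> <name>=<value>`, with only the value formatting changing by type. The following is a minimal sketch of how such command lines could be generated consistently, assuming the `s policy` syntax proposed in this document. The helper names `format_policy_value` and `build_policy_command` are hypothetical and not part of PBS; the code only builds strings and does not invoke qmgr.

```python
import shlex


def format_policy_value(value):
    """Render a Python value the way the table's examples write it.

    Assumption: Boolean -> True/False, integer -> digits,
    array_string -> one double-quoted, comma-separated string,
    string -> double-quoted only when it contains spaces or commas.
    """
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "True" if value else "False"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, (list, tuple)):
        return '"' + ", ".join(str(item) for item in value) + '"'
    text = str(value)
    if any(ch in text for ch in " ,"):
        return '"' + text + '"'
    return text


def build_policy_command(policy_object, name, value):
    """Build a qmgr command line in the style proposed in this document."""
    directive = f"s policy {policy_object} {name}={format_policy_value(value)}"
    # shlex.quote wraps the directive in single quotes for the shell,
    # so embedded double quotes survive unchanged.
    return "qmgr -c " + shlex.quote(directive)


if __name__ == "__main__":
    # Hypothetical settings for a policy object named p1.
    settings = {
        "round_robin": True,
        "preempt_queue_prio": 190,
        "max_starve": "24:00:00",
        "node_sort_key": ["sort_priority HIGH", "ncpus HIGH"],
    }
    for name, value in settings.items():
        print(build_policy_command("p1", name, value))
```

Wrapping the whole directive in single quotes avoids the unbalanced-quote mistakes that are easy to make when a value itself needs double quotes.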