...

Gist of design: Each job can pass a limited Python expression that filters nodes on the basis of the resources_available and resources_assigned values present on the nodes. The scheduler applies this filter to the nodes every time a job submitted with node_filter is considered to run, and every time the scheduler tries to add the job to the calendar.

link to  forum discussion

...

Extend PBS to allow users to submit jobs with a node-filter (node_filter) expression.

...

  • Details:

...

    • Example: qsub -lselect=3:ncpus=2:mem=18gb -lnode_filter="ncpus<=8 and model==Skylake or ncpus>=8 and model==Haswell" job.scr
    • A user can specify a node filter with each job. The scheduler uses this filter to select the nodes that the job is allowed to run on.
    • There is a new built-in resource, "node_filter". This resource is of type string. Users, operators, and managers have privileges to read and write this resource. It is not a host-level resource.

...

    • node_filter is an expression created by using resources_available on the nodes.
    • node_filter can be a collection of multiple filters separated by the 'or' operator. The scheduler culls the nodes by applying the filters and runs the job as soon as it finds a node solution.
    • If there are many filters separated by the 'or' operator, the scheduler applies them one at a time. After applying each filter, the scheduler tries to find a node solution for the job. If it finds one, it uses it; otherwise it discards the result of that filter and applies the next filter to all the nodes again. The scheduler may apply the filters in any order.
    • Users can specify a node filter with node resources using operators like "==, <, >, <=, >=, !=

...

    • ,

...

    • and

...
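The 'or' handling described in the list above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the scheduler's implementation: split_filters, matches, find_solution, and the flat node dictionaries are hypothetical names, the parser handles only "name op value" terms joined by 'and', and a "node solution" is simplified to "enough matching nodes".

```python
import operator
import re

# Comparison operators allowed in a filter term.
OPS = {"==": operator.eq, "!=": operator.ne, "<=": operator.le,
       ">=": operator.ge, "<": operator.lt, ">": operator.gt}

# One term: resource name, operator, value (for example "ncpus<=8").
TERM = re.compile(r"^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(\w+)\s*$")


def split_filters(expr):
    """Split the expression on 'or' into individual filters."""
    return [part.strip() for part in re.split(r"\s+or\s+", expr)]


def matches(node, flt):
    """Return True if the node satisfies every 'and'-joined term."""
    for term in re.split(r"\s+and\s+", flt):
        m = TERM.match(term)
        if m is None:
            raise ValueError(f"cannot parse term: {term!r}")
        name, op, want = m.groups()
        have = node.get(name)
        if have is None:
            return False
        if want.isdigit():          # numeric comparison
            have, want = int(have), int(want)
        if not OPS[op](have, want):
            return False
    return True


def find_solution(nodes, expr, nodes_needed):
    """Try each filter in turn; discard its result if no solution is found.

    Filters are tried in written order here; the design allows any order.
    """
    for flt in split_filters(expr):
        candidates = [n for n in nodes if matches(n, flt)]
        if len(candidates) >= nodes_needed:   # simplified "node solution"
            return candidates[:nodes_needed]
        # Not enough nodes: drop this result, start again from all nodes.
    return None


# Hypothetical nodes, for illustration only.
nodes = [
    {"name": "n1", "ncpus": 8, "model": "Skylake"},
    {"name": "n2", "ncpus": 16, "model": "Haswell"},
    {"name": "n3", "ncpus": 8, "model": "Haswell"},
    {"name": "n4", "ncpus": 2, "model": "Haswell"},
]
expr = "ncpus<=8 and model==Skylake or ncpus>=8 and model==Haswell"
solution = find_solution(nodes, expr, 2)
```

In this sketch the first filter matches only one node, so its result is discarded and the second filter is applied to the full node list again.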

During evaluation, the nfilter expression has access to a node dictionary, which itself consists of two dictionaries: resources_available and resources_assigned. It will look something like this:

node={'resources_available':{'ncpus':'8','mem':'16777215kb',...},'resources_assigned':{'ncpus':'2', 'mem':'4194304kb',...}}
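One way to evaluate a limited Python expression against such a node dictionary is sketched below. It is an assumption-laden illustration, not the scheduler's code: evaluate_filter and coerce are hypothetical helpers, the subscript syntax used in the sample expressions is assumed, and converting all-digit strings to integers is a choice made here so that numeric comparisons work.

```python
import ast

# Only comparisons, and/or, subscripts, constants, and the name "node"
# are accepted (written for Python 3.9+ AST node types).
ALLOWED = (ast.Expression, ast.BoolOp, ast.And, ast.Or, ast.Compare,
           ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
           ast.Subscript, ast.Name, ast.Load, ast.Constant)


def coerce(node):
    """Return a copy of the node dict with all-digit strings as ints."""
    return {
        group: {k: int(v) if isinstance(v, str) and v.isdigit() else v
                for k, v in resources.items()}
        for group, resources in node.items()
    }


def evaluate_filter(expr, node):
    """Evaluate expr with 'node' as the only visible name."""
    tree = ast.parse(expr, mode="eval")
    for sub in ast.walk(tree):
        if not isinstance(sub, ALLOWED):
            raise ValueError(f"disallowed syntax: {type(sub).__name__}")
        if isinstance(sub, ast.Name) and sub.id != "node":
            raise ValueError(f"unknown name: {sub.id}")
    code = compile(tree, "<filter>", "eval")
    # No builtins are exposed; the whitelist above already rejects calls.
    return bool(eval(code, {"__builtins__": {}}, {"node": node}))


node = coerce({
    "resources_available": {"ncpus": "8", "mem": "16777215kb"},
    "resources_assigned": {"ncpus": "2", "mem": "4194304kb"},
})
ok = evaluate_filter(
    "node['resources_available']['ncpus'] >= 8 "
    "and node['resources_assigned']['ncpus'] < 4",
    node,
)
```

Restricting the syntax tree before evaluation keeps the expression "limited": function calls, attribute access, and any name other than node are rejected with an error.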

...

Interface 2: Errors logged in scheduler log file while evaluating "nfilter" resource expression

...

Visibility: Public

...

Change Control: Stable

Details:

If the scheduler fails to evaluate the nfilter expression present in the job's resource list, it will log the following messages at the DEBUG2 log level:

...

    • , or" 
    • While applying the filter, if the scheduler encounters a resource in the filter that is not present in the 'resources' line of sched_config, the scheduler will ignore that resource while filtering the nodes.
  • Caveats:
    • node_filter and -lplace=group cannot be used together in the same job. Such a job submission will fail with the error "node filter cannot be used with placement grouping".
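The unknown-resource rule and the placement caveat can be illustrated with a small sketch. Everything here is hypothetical: configured_resources, usable_terms, check_submission, and the resource_list dictionary are names invented for illustration, "ignore that resource" is interpreted as skipping the terms that mention it, and the sample 'resources' line is made up.

```python
import re

# One filter term: resource name, operator, value.
TERM = re.compile(r"^\s*(\w+)\s*(==|!=|<=|>=|<|>)\s*(\w+)\s*$")


def configured_resources(resources_line):
    """Parse a line such as: resources: "ncpus, mem, arch"."""
    _, _, value = resources_line.partition(":")
    names = value.strip().strip('"').split(",")
    return {name.strip() for name in names if name.strip()}


def usable_terms(flt, configured):
    """Keep only the 'and'-joined terms whose resource is configured."""
    kept = []
    for term in re.split(r"\s+and\s+", flt):
        m = TERM.match(term)
        if m and m.group(1) in configured:
            kept.append(term.strip())
    return kept


def check_submission(resource_list):
    """Reject a job that combines node_filter with placement grouping."""
    place = resource_list.get("place", "")
    if "node_filter" in resource_list and "group" in place:
        raise ValueError(
            "node filter cannot be used with placement grouping")


configured = configured_resources('resources: "ncpus, mem, arch"')
terms = usable_terms("ncpus<=8 and model==Skylake", configured)
```

With this sample line, the model term is dropped because model is not listed, and only the ncpus term is used for filtering.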