Comment: Address additional comment from @bhroam

...

Parameter Name | Default Value | Description
cgroup_prefix | "pbspro" | The parent directory under each cgroup subsystem where job cgroups will be created. For example, if the memory subsystem is located at /sys/fs/cgroup/memory then the memory cgroup for job 123.foo would be found in the /sys/fs/cgroup/memory/pbspro/123.foo directory.
cgroup_lock_file | "/var/spool/pbs/mom_priv/cgroups.lock" | This file is used to ensure reads and writes of the PBS Professional cgroups are mutually exclusive. The filesystem must support file locking.
exclude_hosts | [ ] | Specifies the list of hosts for which the cgroups hook should be disabled.
exclude_vntypes | [ ] | Specifies a list of vnode types for which the cgroups hook should be disabled. This applies to the builtin vntype attribute resource assigned to a node.
kill_timeout | 10 | Specifies the number of seconds the cgroup hook should spend attempting to kill a process within a cgroup.
nvidia-smi | /usr/bin/nvidia-smi | The location of the nvidia-smi command on nodes supporting NVIDIA GPU devices.
online_offlined_nodes | false | When the cgroup hook fails to kill all processes within a cgroup, it will offline the node to prevent oversubscribing resources. The cgroup hook will periodically attempt to clean up these "orphaned" cgroups. When set to false, the administrator must manually online the node when the problem is resolved. When set to true, the hook will return the node to service automatically.
periodic_resc_update | false | When set to true, the hook periodically polls the cgroups of a running job and updates the job's resource usage for the cput, mem, and vmem resources. When set to false, MoM periodically polls /proc to obtain resource usage data.
placement_type | "load_balanced" | When this parameter is set to "load_balanced", the cgroup hook will reorder the sockets of a multi-socket node in an effort to distribute load across them. Sockets with the fewest jobs assigned to them will be allocated first. When any value other than "load_balanced" is specified, the sockets are allocated in their assigned numeric order.
run_only_on_hosts | [ ] | Specifies the list of hosts for which the cgroup hook should be enabled. If the list is not empty, it overrides the settings of exclude_hosts and exclude_vntypes.
use_hyperthreads | false | When set to true, hyperthreads are treated as though they were physical cores. When set to false, hyperthreads are not counted as physical cores and are not added to the cpuset created for the job on each node, even when hyperthreading is enabled for the CPU.
vnode_per_numa_node | false | When set to true, each NUMA node will appear as though it were an independent vnode managed by a parent vnode. The parent vnode will have no resources associated with it. The vnode names will consist of the node name followed by an ordinal in brackets (e.g. foo[0], foo[1], foo[2], etc.). When set to false, the node will appear as a single vnode.
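
The way cgroup_prefix, the host filters, and placement_type interact can be sketched in Python. This is only an illustration of the descriptions in the table, not the hook's actual code: the function names (job_cgroup_path, hook_enabled_on, order_sockets) and the jobs_per_socket mapping are hypothetical.

```python
import os


def job_cgroup_path(subsystem_root, cgroup_prefix, job_id):
    """Build the directory where a job's cgroup would live for one subsystem."""
    return os.path.join(subsystem_root, cgroup_prefix, job_id)


def hook_enabled_on(host, vntype, cfg):
    """Decide whether the hook applies to a host (hypothetical helper).

    A non-empty run_only_on_hosts list overrides both exclude lists.
    """
    run_only = cfg.get("run_only_on_hosts", [])
    if run_only:
        return host in run_only
    if host in cfg.get("exclude_hosts", []):
        return False
    if vntype in cfg.get("exclude_vntypes", []):
        return False
    return True


def order_sockets(jobs_per_socket, placement_type):
    """Return socket ids in the order they would be allocated.

    jobs_per_socket maps a socket id to the number of jobs assigned to it
    (an assumed representation). Ties keep numeric order because the sort
    is stable.
    """
    sockets = sorted(jobs_per_socket)
    if placement_type == "load_balanced":
        sockets.sort(key=lambda s: jobs_per_socket[s])
    return sockets
```

For example, job_cgroup_path("/sys/fs/cgroup/memory", "pbspro", "123.foo") yields the directory named in the cgroup_prefix description.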

...

Parameter Name | Default Value | Description
enabled | false | When set to true, the hook will update job CPU time using the value from the cpuacct subsystem (e.g. /sys/fs/cgroup/cpuacct/pbspro/123.foo/cpuacct.usage). When set to false, CPU time is accumulated when MoM periodically polls the processes of the job.
exclude_hosts | [ ] | Specifies the list of hosts for which the use of this subsystem should be disabled.
exclude_vntypes | [ ] | Specifies the list of vnode types for which the use of this subsystem should be disabled. This applies to the builtin vntype attribute resource assigned to a node.
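
As a rough illustration of reading CPU time from the cpuacct subsystem, the sketch below parses a job's cpuacct.usage file. The kernel reports this value in nanoseconds. The function name read_cpu_seconds is hypothetical, and the hook's real implementation may differ.

```python
import os

NANOSECONDS_PER_SECOND = 1_000_000_000


def read_cpu_seconds(cgroup_dir):
    """Return the CPU time consumed by a cgroup, in seconds.

    cgroup_dir is the job's cpuacct directory, for example
    /sys/fs/cgroup/cpuacct/pbspro/123.foo.
    """
    with open(os.path.join(cgroup_dir, "cpuacct.usage")) as usage_file:
        return int(usage_file.read().strip()) / NANOSECONDS_PER_SECOND
```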

cpuset Subsystem:

...

Parameter Name | Default Value | Description
enabled | false | When set to true, the hook will create a cpuset for each job. The hook will configure the cpuset based on the resources requested by the job, taking into account the number of CPUs and memory requirements. This helps to ensure the job uses memory that is local to the CPUs assigned to the job. When set to false, the kernel is free to schedule processes and allocate memory based on the system configured policies. NOTE: When running the SGI cpuset MoM, this subsystem should be disabled to prevent interference with MoM's cpuset functionality. The administrator may also choose to run the standard MoM and enable this subsystem.
exclude_hosts | [ ] | Specifies the list of hosts for which the use of this subsystem should be disabled.
exclude_vntypes | [ ] | Specifies the list of vnode types for which the use of this subsystem should be disabled. This applies to the builtin vntype attribute resource assigned to a node.

devices Subsystem:

...

Parameter Name | Default Value | Description
enabled | false | When set to true, the hook will configure the devices subsystem based on the number of nmics and ngpus requested by the job. Refer to the allow parameter below for additional information. When set to false, no cgroup will be created for the devices subsystem.
exclude_hosts | [ ] | Specifies the list of hosts for which the use of this subsystem should be disabled.
exclude_vntypes | [ ] | Specifies the list of vnode types for which the use of this subsystem should be disabled. This applies to the builtin vntype attribute resource assigned to a node.
allow | [ ] | Specifies how access to devices will be controlled. The list consists of entries in one of the following formats:

  • A string entry will be used verbatim. For example, "b *:* rwm" allows full access (read, write, and mknod) to all block devices and "c *:* rwm" allows full access to all character devices.
  • A list containing two strings. For example, ["mic/scif","rwm"] will look for the major and minor number of the mic/scif device and allow full access. If a listing of /dev/mic reported "crw-rw-rw- 1 root root 244, 1 Mar 30 14:50 scif" then the line added to the allow file would look like "c 244:1 rwm".
  • A list containing three strings. For example, ["nvidiactl","rwm", "*"] will look for the major number of the nvidiactl device and allow full access. If /dev/nvidiactl reported "crw-rw-rw- 1 root root 284, 1 Mar 30 14:50 nvidiactl" then the line added to the allow file would look like "c 284:* rwm".
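
The three entry formats above could be translated into allow-file lines roughly as follows. This is a minimal sketch, not the hook's code: the names device_numbers and allow_rule are hypothetical, and it assumes that the third string of a three-element entry replaces the minor number, which is what the nvidiactl example suggests.

```python
import os
import stat


def device_numbers(path):
    """Return (type, major, minor) for a device node, e.g. ("c", 244, 1)."""
    info = os.stat(path)
    if stat.S_ISCHR(info.st_mode):
        kind = "c"
    elif stat.S_ISBLK(info.st_mode):
        kind = "b"
    else:
        raise ValueError("%s is not a device node" % path)
    return kind, os.major(info.st_rdev), os.minor(info.st_rdev)


def allow_rule(entry, lookup=device_numbers):
    """Convert one entry of the allow list into a line for the allow file.

    lookup is injectable so the logic can be exercised without real devices.
    """
    if isinstance(entry, str):
        return entry  # string entries are used verbatim
    name, access = entry[0], entry[1]
    kind, major, minor = lookup(os.path.join("/dev", name))
    if len(entry) == 3:
        minor = entry[2]  # assumption: third string overrides the minor number
    return "%s %s:%s %s" % (kind, major, minor, access)
```

With a lookup that reports ("c", 244, 1) for /dev/mic/scif, allow_rule(["mic/scif", "rwm"]) produces "c 244:1 rwm", matching the example in the list.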

...

Parameter Name | Default Value | Description
enabled | false | When set to true, the hook will register a limit that restricts the amount of huge page memory processes may access. When set to false, no limit is registered.
exclude_hosts | [ ] | Specifies the list of hosts for which the use of this subsystem should be disabled.
exclude_vntypes | [ ] | Specifies the list of vnode types for which the use of this subsystem should be disabled. This applies to the builtin vntype attribute resource assigned to a node.
default | 0MB | The amount of huge page memory assigned to the cgroup when the job does not request hpmem.
reserve_percent | 0 | The percentage of available huge page memory that is not to be assigned to jobs. This will alter the amount of hpmem that MoM reports to the server. The amount this percentage represents is added to reserve_amount to obtain the total amount reserved.
reserve_amount | 0MB | The amount of available huge page memory that is not to be assigned to jobs. This will alter the amount of hpmem that MoM reports to the server. This amount is added to the amount derived from reserve_percent to obtain the total amount reserved.
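
The reservation arithmetic described for reserve_percent and reserve_amount can be sketched as below. The function names are hypothetical, values are assumed to be in megabytes, and clamping the reported value at zero is an assumption rather than documented behavior.

```python
def reserved_mb(total_mb, reserve_percent, reserve_amount_mb):
    """Total reservation: the percentage share plus the fixed amount."""
    return total_mb * reserve_percent / 100.0 + reserve_amount_mb


def reportable_mb(total_mb, reserve_percent, reserve_amount_mb):
    """Amount left to report to the server after the reservation.

    Clamped at zero (an assumption) so an oversized reservation never
    produces a negative value.
    """
    remaining = total_mb - reserved_mb(total_mb, reserve_percent, reserve_amount_mb)
    return max(0, remaining)
```

For instance, with 1000 MB available, reserve_percent set to 10, and reserve_amount set to 50MB, 150 MB would be reserved and 850 MB reported.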

...

Parameter Name | Default Value | Description
enabled | false | The hook will register the physical memory limit for a job when set to true. No limit is registered when set to false.
exclude_hosts | [ ] | Specifies the list of hosts for which the use of this subsystem should be disabled.
exclude_vntypes | [ ] | Specifies the list of vnode types for which the use of this subsystem should be disabled. This applies to the builtin vntype attribute resource assigned to a node.
soft_limit | false | A soft memory limit is used to specify the minimum amount of physical memory a job should be allocated before utilizing swap space. This adjusts the behavior of the kernel by allowing the physical memory allocation to exceed the amount specified in the soft limit when memory demand (a.k.a. memory pressure) is low. The cgroup is ultimately limited to the amount of virtual memory specified in the memsw subsystem. When memory pressure increases, the kernel will begin to page physical memory out to swap space until the soft limit is reached. Soft memory limits allow processes to take advantage of physical memory when it is available, but may lead to longer run times when memory pressure is high. Soft memory limits are used when this parameter is set to true. When set to false, hard memory limits are used that prevent the processes from ever exceeding their specified mem limit.
default | 0MB | The amount of physical memory available to a cgroup when no mem limit has been specified.
reserve_percent | 0 | The percentage of available physical memory that is not to be assigned to jobs. This will alter the amount of mem that MoM reports to the server. The amount this percentage represents is added to reserve_amount to obtain the total amount reserved.
reserve_amount | 0MB | The amount of available physical memory that is not to be assigned to jobs. This will alter the amount of mem that MoM reports to the server. This amount is added to the amount derived from reserve_percent to obtain the total amount reserved.

...

Parameter Name | Default Value | Description
enabled | false | The hook will register the virtual memory limit for a job when set to true. No limit is registered when set to false.
exclude_hosts | [ ] | Specifies the list of hosts for which the use of this subsystem should be disabled.
exclude_vntypes | [ ] | Specifies the list of vnode types for which the use of this subsystem should be disabled. This applies to the builtin vntype attribute resource assigned to a node.
default | 0MB | The amount of virtual memory available to a cgroup when no vmem limit has been specified.
reserve_percent | 0 | The percentage of available virtual memory that is not to be assigned to jobs. This will alter the amount of vmem that MoM reports to the server. The amount this percentage represents is added to reserve_amount to obtain the total amount reserved.
reserve_amount | 0MB | The amount of available virtual memory that is not to be assigned to jobs. This will alter the amount of vmem that MoM reports to the server. This amount is added to the amount derived from reserve_percent to obtain the total amount reserved.

...