...

Example Cgroup Configuration File
{
    "cgroup_prefix"         : "pbspro",
    "exclude_hosts"         : ["node001", "node002"],
    "exclude_vntypes"       : ["disable_cgroups", "login_node"],
    "run_only_on_hosts"     : [],
    "periodic_resc_update"  : true,
    "vnode_per_numa_node"   : false,
    "online_offlined_nodes" : true,
    "use_hyperthreads"      : false,
    "ncpus_are_cores"       : false,
    "cgroup" : {
        "cpuacct" : {
            "enabled"         : true,
            "exclude_hosts"   : ["node003"],
            "exclude_vntypes" : ["red_node"]
        },
        "cpuset" : {
            "enabled"         : true,
            "exclude_hosts"   : ["node004"],
            "exclude_vntypes" : ["green_node"]
        },
        "devices" : {
            "enabled"         : false,
            "exclude_hosts"   : [],
            "exclude_vntypes" : [],
            "allow"           : [
                "b *:* rwm",
                "c *:* rwm",
                ["mic/scif", "rwm"],
                ["nvidiactl", "rwm", "*"],
                ["nvidia-uvm", "rwm"]
            ]
        },
        "hugetlb" : {
            "enabled"         : false,
            "exclude_hosts"   : [],
            "exclude_vntypes" : [],
            "default"         : "0MB",
            "reserve_percent" : "0",
            "reserve_amount"  : "0MB"
        },
        "memory" : {
            "enabled"         : true,
            "exclude_hosts"   : [],
            "exclude_vntypes" : ["blue_node"],
            "soft_limit"      : false,
            "default"         : "256MB",
            "reserve_percent" : "0",
            "reserve_amount"  : "1GB"
        },
        "memsw" : {
            "enabled"         : true,
            "exclude_hosts"   : [],
            "exclude_vntypes" : ["grey_node"],
            "default"         : "256MB",
            "reserve_percent" : "0",
            "reserve_amount"  : "1GB"
        }
    }
}
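
Because the configuration file is plain JSON, it can be sanity-checked with any JSON parser before it is imported into the hook. The following is a minimal sketch, assuming Python's standard json module; the helper name enabled_subsystems is hypothetical and is not part of PBS Professional. It parses a trimmed copy of the example above and lists the subsystems whose "enabled" flag is true.

```python
import json

# Trimmed copy of the example configuration; only the keys this sketch
# reads are kept.
EXAMPLE_CONFIG = """
{
    "cgroup_prefix" : "pbspro",
    "cgroup" : {
        "cpuacct" : {"enabled" : true},
        "cpuset"  : {"enabled" : true},
        "devices" : {"enabled" : false},
        "hugetlb" : {"enabled" : false},
        "memory"  : {"enabled" : true, "default" : "256MB"},
        "memsw"   : {"enabled" : true, "default" : "256MB"}
    }
}
"""


def enabled_subsystems(config):
    """Return the sorted names of subsystems whose "enabled" flag is true.

    Hypothetical helper for illustration; a missing "enabled" key is
    treated as false here, which is an assumption of this sketch.
    """
    subsystems = config.get("cgroup", {})
    return sorted(name for name, opts in subsystems.items()
                  if opts.get("enabled", False))


# json.loads raises json.JSONDecodeError if the text is not valid JSON,
# which catches stray commas and unquoted keys early.
config = json.loads(EXAMPLE_CONFIG)
print(enabled_subsystems(config))  # ['cpuacct', 'cpuset', 'memory', 'memsw']
```

A check like this only confirms that the file is well-formed JSON and shows which subsystems are switched on; it does not validate parameter names or values the way the hook itself would.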

...

| Parameter Name | Default Value | Description |
| --- | --- | --- |
| cgroup_prefix | "pbspro" | The parent directory under each cgroup subsystem where job cgroups will be created. For example, if the memory subsystem is located at /sys/fs/cgroup/memory, then the memory cgroup for job 123.foo would be found in the /sys/fs/cgroup/memory/pbspro/123.foo directory. |
| cgroup_lock_file | "/var/spool/pbs/mom_priv/cgroups.lock" | This file is used to ensure that reads and writes of the PBS Professional cgroups are mutually exclusive. The filesystem must support file locking. |
| exclude_hosts | [ ] | Specifies the list of hosts for which the cgroups hook should be disabled. |
| exclude_vntypes | [ ] | Specifies a list of vnode types for which the cgroups hook should be disabled. This applies to the built-in vntype resource assigned to a node. |
| kill_timeout | 10 | Specifies the number of seconds the cgroup hook should spend attempting to kill a process within a cgroup. |
| ncpus_are_cores | false | When set to true, hyperthreads are not included when calculating the ncpus value that MoM reports to the server. |
| nvidia-smi | /usr/bin/nvidia-smi | The location of the nvidia-smi command on nodes supporting NVIDIA GPU devices. |
| online_offlined_nodes | false | When the cgroup hook fails to kill all processes within a cgroup, it will offline the node to prevent oversubscribing resources. The cgroup hook will periodically attempt to clean up these "orphaned" cgroups. When set to false, the administrator must manually online the node when the problem is resolved. When set to true, the hook will return the node to service automatically. |
| periodic_resc_update | false | When set to true, the hook periodically polls the cgroups of a running job and updates the job's resource usage for the cput, mem, and vmem resources. When set to false, MoM periodically polls /proc to obtain resource usage data. |
| placement_type | "load_balanced" | When this parameter is set to "load_balanced", the cgroup hook will reorder the sockets of a multi-socket node in an effort to distribute load across them. Sockets with the fewest jobs assigned to them will be allocated first. When any value other than "load_balanced" is specified, the sockets are allocated in their assigned numeric order. |
| run_only_on_hosts | [ ] | Specifies the list of hosts for which the cgroup hook should be enabled. If the list is not empty, it overrides the settings of exclude_hosts and exclude_vntypes. |
| use_hyperthreads | false | When set to true, hyperthreads are treated as though they were physical cores. When set to false, hyperthreads are not counted as physical cores and are not added to the cpuset created for the job on each node, even when hyperthreading is enabled for the CPU. |
| vnode_per_numa_node | false | When set to true, each NUMA node will appear as though it were an independent vnode managed by a parent vnode. The parent vnode will have no resources associated with it. The vnode names will consist of the node name followed by an ordinal in brackets (e.g. foo[0], foo[1], foo[2], etc.). When set to false, the node will appear as a single vnode. |
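
The interaction between run_only_on_hosts, exclude_hosts, and exclude_vntypes, along with the cgroup_prefix path layout and the per-NUMA vnode naming, can be illustrated with a short sketch. This is not the hook's actual code: the function names hook_enabled, job_cgroup_dir, and numa_vnode_names are hypothetical, and exact string matching on host names is an assumption (the real hook may compare names differently).

```python
import posixpath


def hook_enabled(config, host, vntype=None):
    """Decide whether the hook would act on this host (illustrative only).

    A non-empty run_only_on_hosts list overrides exclude_hosts and
    exclude_vntypes, as described in the parameter table.
    """
    run_only = config.get("run_only_on_hosts", [])
    if run_only:
        return host in run_only
    if host in config.get("exclude_hosts", []):
        return False
    if vntype is not None and vntype in config.get("exclude_vntypes", []):
        return False
    return True


def job_cgroup_dir(subsystem_root, config, job_id):
    """Build <subsystem root>/<cgroup_prefix>/<job id>."""
    prefix = config.get("cgroup_prefix", "pbspro")
    return posixpath.join(subsystem_root, prefix, job_id)


def numa_vnode_names(node, numa_node_count):
    """Vnode names used when vnode_per_numa_node is true: foo[0], foo[1], ..."""
    return ["%s[%d]" % (node, i) for i in range(numa_node_count)]


# Top-level settings taken from the example configuration file.
cfg = {
    "cgroup_prefix": "pbspro",
    "exclude_hosts": ["node001", "node002"],
    "exclude_vntypes": ["disable_cgroups", "login_node"],
    "run_only_on_hosts": [],
}

print(hook_enabled(cfg, "node001"))                # False: host is excluded
print(hook_enabled(cfg, "node010", "login_node"))  # False: vntype is excluded
print(hook_enabled(cfg, "node010"))                # True
print(job_cgroup_dir("/sys/fs/cgroup/memory", cfg, "123.foo"))
# /sys/fs/cgroup/memory/pbspro/123.foo
print(numa_vnode_names("foo", 3))                  # ['foo[0]', 'foo[1]', 'foo[2]']
```

The same exclude_hosts and exclude_vntypes keys also appear inside each subsystem section of the configuration file, so a similar check could be applied per subsystem; the sketch covers only the top-level settings.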

...