Forum discussion link: http://community.pbspro.org/t/external-design-document-for-pp-824-cray-ramp-rate-limiting/693
Interface design:
{
"power_ramp_rate_enable": "True",
"power_on_off_enable": "False",
"node_idle_limit": "1000",
"min_power_down_delay": "600",
"max_jobs_per_queue_limit": "80"
}
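The sample stores every value as a JSON string, so a consumer has to convert types and fall back to defaults for omitted keys. The following is a minimal sketch of that step, not the actual PBS implementation: the function name `parse_power_config` is hypothetical, and the defaults are copied from the parameter table in this document.

```python
import json

# Defaults copied from the parameter table in this document.
DEFAULTS = {
    "power_ramp_rate_enable": False,
    "power_on_off_enable": False,
    "node_idle_limit": 1800,
    "min_power_down_delay": 1800,
    "max_jobs_per_queue_limit": 100,
    "max_concurrent_power_limit": 10,
}


def parse_power_config(text):
    """Hypothetical helper: merge a JSON config string over DEFAULTS.

    Values arrive as strings ("True", "1000"), so each one is converted
    to the type of its default. Unknown keys are rejected.
    """
    raw = json.loads(text)
    config = dict(DEFAULTS)
    for key, value in raw.items():
        if key not in DEFAULTS:
            raise KeyError("unknown parameter: %s" % key)
        # Check bool first, because bool is a subclass of int in Python.
        if isinstance(DEFAULTS[key], bool):
            config[key] = str(value).strip().lower() == "true"
        else:
            config[key] = int(value)
    return config


SAMPLE = """
{
"power_ramp_rate_enable": "True",
"power_on_off_enable": "False",
"node_idle_limit": "1000",
"min_power_down_delay": "600",
"max_jobs_per_queue_limit": "80"
}
"""

config = parse_power_config(SAMPLE)
```

With the sample above, `max_concurrent_power_limit` is not set and therefore keeps its default.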
Parameter Name | Default value | Description |
---|---|---|
power_ramp_rate_enable | False | When enabled, PBS performs ramp rate limiting across the PBS cluster running on a Cray CLE 6.0 platform. Nodes that are ramped up are kept at sleep state C1; nodes that are ramped down are put into sleep state C6. |
power_on_off_enable | False | When enabled, PBS powers nodes on and off, but only nodes whose node attribute poweroff_eligible is true. |
node_idle_limit | 1800 | How long a node must remain idle before it is considered for power-off or ramp-down. |
min_power_down_delay | 1800 | The minimum time a node must stay powered off before it can be considered for power-on. |
max_jobs_per_queue_limit | 100 | Queue-level limit on the number of queued jobs that are analyzed for power-on/ramp-up in each queue. |
max_concurrent_power_limit | 10 | The maximum number of nodes that can be powered on/off or ramped up/down at a time. |
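
As an illustration of how node_idle_limit, min_power_down_delay, and max_concurrent_power_limit might interact, here is a hedged sketch. It assumes all times are in seconds, that the thresholds are inclusive, and that the longest-idle nodes are picked first; none of these details are stated in this document. The node records, node names, and the functions `select_ramp_down` and `can_power_on` are hypothetical.

```python
def select_ramp_down(nodes, now, node_idle_limit=1800,
                     max_concurrent_power_limit=10):
    """Return names of idle nodes to ramp down, longest idle first.

    Each node is a dict with "name" and "idle_since" (a timestamp in
    seconds, or None when the node is busy). At most
    max_concurrent_power_limit names are returned.
    """
    idle = [
        n for n in nodes
        if n["idle_since"] is not None
        and now - n["idle_since"] >= node_idle_limit
    ]
    idle.sort(key=lambda n: n["idle_since"])
    return [n["name"] for n in idle[:max_concurrent_power_limit]]


def can_power_on(powered_off_at, now, min_power_down_delay=1800):
    """True once a node has been off for at least min_power_down_delay."""
    return now - powered_off_at >= min_power_down_delay


# Hypothetical node states for illustration only.
nodes = [
    {"name": "nid00024", "idle_since": 0},
    {"name": "nid00025", "idle_since": 1500},
    {"name": "nid00026", "idle_since": None},
    {"name": "nid00027", "idle_since": 100},
]

picked = select_ramp_down(nodes, now=2000, max_concurrent_power_limit=2)
picked_one = select_ramp_down(nodes, now=2000, max_concurrent_power_limit=1)
```

Capping the result list is what keeps the number of simultaneous power transitions bounded, which is the purpose of max_concurrent_power_limit.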
Details: The new log and error messages introduced by the power ramp rate limiting feature are listed below.
# | Scenario | Log/error message |
---|---|---|
1 | Nodes are being ramped down | In server logs:<br>Job;power_ramp_down;launch: /opt/cray/capmc/default/bin/capmc set_sleep_state_limit --nids 24-25 --limit 4<br>Job;power_ramp_down;launch: finished<br>Log level: LOG_INFO |
2 | Nodes are being ramped up | In server logs:<br>Job;power_ramp_up;launch: /opt/cray/capmc/default/bin/capmc set_sleep_state_limit --nids 24-25 --limit 0<br>Job;power_ramp_up;launch: finished<br>Log level: LOG_INFO |
3 | Server periodic hook output | In server logs:<br>power_ramp_limit: nodes to ramp up: <node_list><br>power_ramp_limit: nodes to ramp down: <node_list><br>Log level: LOG_INFO |
4 | Nodes are being powered off | In server logs:<br>03/29/2016 02:05:59;0008;Server@sdb;Job;node_power_off;launch: /opt/cray/capmc/default/bin/capmc node_off --nids 24-25<br>03/29/2016 02:06:01;0008;Server@sdb;Job;node_power_off;launch: finished<br>Log level: LOG_INFO |
5 | Nodes are being powered on | In server logs:<br>03/29/2016 02:05:59;0008;Server@sdb;Job;node_power_on;launch: /opt/cray/capmc/default/bin/capmc node_on --nids 24-25<br>03/29/2016 02:06:01;0008;Server@sdb;Job;node_power_on;launch: finished<br>Log level: LOG_INFO |
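
For testing or monitoring, the full server log lines shown in rows 4 and 5 can be picked apart with a small parser. The sketch below assumes the semicolon-separated layout seen in those samples (timestamp, code, server, object type, event name, message) and assumes that a nid list may contain comma-separated values or ranges such as 24-25; only the single-range form appears in this document. The helper names `expand_nids` and `parse_power_log_line` are hypothetical.

```python
import re

# Matches the nid list passed to capmc, e.g. "--nids 24-25".
NIDS_RE = re.compile(r"--nids\s+(\S+)")


def expand_nids(spec):
    """Expand a nid spec such as "24-25" or "3,7-9" into a list of ints."""
    nids = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            nids.extend(range(int(low), int(high) + 1))
        else:
            nids.append(int(part))
    return nids


def parse_power_log_line(line):
    """Split one server log line into its fields.

    Returns None when the line does not have the six expected fields.
    """
    fields = line.split(";", 5)
    if len(fields) != 6:
        return None
    timestamp, _code, _server, _obj_type, event, message = fields
    match = NIDS_RE.search(message)
    return {
        "timestamp": timestamp,
        "event": event,
        "nids": expand_nids(match.group(1)) if match else [],
        "finished": message.strip() == "launch: finished",
    }


start = parse_power_log_line(
    "03/29/2016 02:05:59;0008;Server@sdb;Job;node_power_off;"
    "launch: /opt/cray/capmc/default/bin/capmc node_off --nids 24-25"
)
done = parse_power_log_line(
    "03/29/2016 02:06:01;0008;Server@sdb;Job;node_power_off;launch: finished"
)
```

Pairing a launch line with the following "launch: finished" line for the same event gives the duration of each capmc call.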