
Forum discussion link: http://community.pbspro.org/t/external-design-document-for-pp-824-cray-ramp-rate-limiting/693

  • Interface 1: Server attribute: power_ramprate_enable
    • Change Control: Stable
    • Synopsis: power_ramprate_enable
    • Details: Server attribute. When set to True, PBS can use the power ramp rate limiting feature for the Cray platform.
      • Setting this attribute also enables the node_idle_time and max_ramprate_limit attributes.
      • Unsetting power_ramprate_enable does not unset node_idle_time and max_ramprate_limit.
      • PBS type: Boolean 
      • Default: unset
      • Python type: bool
      • Managers have read and write permission; all others have read permission.
      • The PBS hook PBS_power is also enabled when this attribute is set.
      • Example:
        qmgr -c "set server power_ramprate_enable = True"
        qmgr -c "set server power_ramprate_enable = 0"
        In a hook: s = pbs.server(); print(s.power_ramprate_enable)
  • Interface 2: New server attribute: node_idle_time
    • Change Control: Stable
    • Synopsis: node_idle_time
    • Details: This new server attribute defines the minimum time a node must be idle before it is considered for power ramp down.
      • Enabled when server attribute power_ramprate_enable is set.
      • The default value is set to 1800 seconds.
      • Managers and Operators have set permission. All users have read permission.
      • To modify the default value use qmgr:
        • qmgr -c "set server node_idle_time = <new_value>"
        • <new_value> is the time in seconds and must be a positive, non-zero integer.
      • PBS type: long
      • Python type: int
      • Example:
        qmgr -c "set server node_idle_time = 2000"
        In a hook: s = pbs.server(); print(s.node_idle_time)
  • Interface 3: New server attribute: max_ramprate_limit
    • Change Control: Stable
    • Synopsis: max_ramprate_limit
    • Details: This new server attribute defines the maximum number of nodes that are allowed to drop to C-6 (least possible sleep state).
      • Enabled when server attribute power_ramprate_enable is set.
      • The default value is set to 5.
      • Managers and Operators have set permission. All users have read permission.
      • To modify the default value use qmgr:
        • qmgr -c "set server max_ramprate_limit = <new_value>"
        • <new_value> must be a positive, non-zero integer.
      • PBS type: long
      • Python type: int
      • Example:
        qmgr -c "set server max_ramprate_limit = 20"
        In a hook: s = pbs.server(); print(s.max_ramprate_limit)
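The value constraints on node_idle_time and max_ramprate_limit can be illustrated with a small sketch. The helper name qmgr_set_server_command is hypothetical and not part of PBS; it only builds the qmgr command line shown in the examples above and rejects values that are not positive, non-zero integers. It does not run qmgr.

```python
def qmgr_set_server_command(attribute, value):
    """Build a qmgr command line that sets a server attribute.

    Hypothetical helper for illustration only: it checks that the value
    is a positive, non-zero integer and returns the command as a string.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("%s must be an integer" % attribute)
    if value <= 0:
        raise ValueError("%s must be a positive, non-zero number" % attribute)
    return 'qmgr -c "set server %s = %d"' % (attribute, value)


if __name__ == "__main__":
    print(qmgr_set_server_command("node_idle_time", 2000))
    print(qmgr_set_server_command("max_ramprate_limit", 20))
```

The real validation happens inside the PBS server; this sketch only mirrors the documented rule so that a wrapper script could fail early.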
  • Interface 4: DELETED
  • Interface 5: New node attribute: last_used_time
    • Change Control: Stable
    • Synopsis: last_used_time
    • Details: This new node attribute is updated with a timestamp at the end of any job or reservation.
      • If a node is released early from a running job, this timestamp is updated.
      • The node status command pbsnodes converts the internal date format (seconds since epoch) to a human-readable format and displays the value of this attribute in "MON DD YY HH:MM:SS" format.
      • The attribute is reset when the node is ramped up.
      • Managers and Operators have read permission.
      • For new vnodes, this attribute is updated for the first time with the current timestamp when power_ramprate_enable is set for that particular node, when the vnodes are created, or when the nodes are rebooted. If power_ramprate_enable was previously unset for a node and is set again, last_used_time is updated with the current timestamp.
      • This attribute can now be used in sched_config as a node_sort_key. This helps sort the nodes based on their last used time.
      • PBS type: long
      • Python type: int
      • Example:
        • node_sort_key: "last_used_time HIGH"
        • node_sort_key: "last_used_time LOW"
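As a rough illustration of the two behaviours described for last_used_time, the sketch below formats a seconds-since-epoch value and orders nodes by it. Several points are assumptions rather than statements from this design: "MON" is read as the abbreviated month name and "YY" as a two-digit year, UTC is used so the output is reproducible, and HIGH is taken to mean largest value first. Both function names are hypothetical.

```python
import time


def format_last_used_time(epoch_seconds):
    """Render seconds since epoch in a "MON DD YY HH:MM:SS" style.

    Assumption: abbreviated month name and two-digit year. UTC is used
    here only to keep the example deterministic.
    """
    return time.strftime("%b %d %y %H:%M:%S", time.gmtime(epoch_seconds))


def sort_nodes_by_last_used(last_used, order="HIGH"):
    """Return node names ordered by last_used_time.

    last_used maps a node name to its last_used_time in seconds.
    Assumption: HIGH puts the largest (most recent) value first and
    LOW puts the smallest (oldest) value first.
    """
    if order not in ("HIGH", "LOW"):
        raise ValueError("order must be HIGH or LOW")
    return sorted(last_used, key=last_used.get, reverse=(order == "HIGH"))


if __name__ == "__main__":
    nodes = {"nid00024": 1500000000, "nid00025": 1500003600}
    for name in sort_nodes_by_last_used(nodes, "LOW"):
        print(name, format_last_used_time(nodes[name]))
```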
  • Interface 6: New node state: asleep
    • Change Control: Stable
    • Synopsis: asleep
    • Details: This new node state is set when nodes are ramped down by PBS via power ramp rate limiting. The scheduler can schedule jobs and reservations on nodes in the "asleep" state. If such nodes are selected by the scheduler, the server ramps them up when they are required to run jobs or for reservations.
      • A server periodic hook (the PBS hook PBS_power, provided as part of the PBS package) runs every $freq seconds, takes a list of vnodes, power ramps those nodes down, and marks them with the new asleep node state.
      • At most $max_ramprate_limit nodes will be ramped down every $freq seconds.
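The selection rule described above can be sketched as follows. This is not the PBS_power hook itself; the function name, the input shape (a mapping of candidate node names to last_used_time), and the choice to ramp down the longest-idle nodes first are all assumptions made for illustration. The defaults 1800 and 5 come from the node_idle_time and max_ramprate_limit interfaces.

```python
def select_nodes_to_ramp_down(last_used, now,
                              node_idle_time=1800, max_ramprate_limit=5):
    """Pick the nodes to ramp down in one periodic hook cycle.

    last_used maps a node name to its last_used_time (seconds since
    epoch). It is assumed to contain only nodes that are free and not
    already asleep.
    """
    # Keep nodes that have been idle for at least node_idle_time seconds.
    idle = [(stamp, name) for name, stamp in last_used.items()
            if now - stamp >= node_idle_time]
    # Assumption: the longest-idle nodes (oldest timestamps) go first.
    idle.sort()
    # Never return more than max_ramprate_limit nodes per cycle.
    return [name for _, name in idle[:max_ramprate_limit]]


if __name__ == "__main__":
    stamps = {"n1": 0, "n2": 100, "n3": 1900}
    print(select_nodes_to_ramp_down(stamps, now=2000))
```

Calling the function once per $freq interval gives the "at most $max_ramprate_limit nodes every $freq seconds" behaviour.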
  • Interface 7: New node state: ramp-up
    • Change Control: Stable
    • Synopsis: ramp-up
    • Details: This new node state will be set when nodes are being ramped up by PBS via power ramp rate limiting.
      • A new server-side hook, power_provisioning, takes nodes that are in the asleep state but assigned to upcoming jobs or reservations and ramps them up.
      • While a node is being ramped up through this hook, it is marked with the new state "ramp-up".
      • This hook interfaces with vendor power APIs through the generic PMI interface to power ramp up the nodes.
  • Interface 8: New server hook event: power_provision
    • Change control: Stable
    • Synopsis: Server hook event power_provision
    • Details: This is a new server side hook event used for power related provisioning. 
      • The hook has access to the name of the vnode to be provisioned. The hook provisions one node at a time.
      • This hook takes the names of only those nodes that are in the asleep state but assigned to upcoming jobs or reservations, and ramps them up.
      • If a job or reservation needs more than max_ramprate_limit nodes ramped up, at most max_ramprate_limit nodes are ramped up at a time in anticipation of use. Once those nodes are provisioned, the next nodes in the provisioning queue are considered.
      • If there are any issues during provisioning, the affected nodes are marked offline.
      • This hook interfaces with vendor power APIs through the generic PMI interface to power ramp up the nodes.
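The batching rule for ramp-up can be sketched as a simple generator. The name ramp_up_batches is hypothetical, and the sketch only shows how a node list is split into groups of at most max_ramprate_limit; it does not model the hook event, the PMI calls, or the offline handling.

```python
def ramp_up_batches(nodes, max_ramprate_limit=5):
    """Yield successive groups of at most max_ramprate_limit nodes.

    nodes is the ordered list of asleep nodes assigned to an upcoming
    job or reservation. Each yielded group would be provisioned before
    the next group is considered.
    """
    if max_ramprate_limit <= 0:
        raise ValueError("max_ramprate_limit must be positive")
    for start in range(0, len(nodes), max_ramprate_limit):
        yield nodes[start:start + max_ramprate_limit]


if __name__ == "__main__":
    pending = ["n1", "n2", "n3", "n4", "n5", "n6", "n7"]
    for batch in ramp_up_batches(pending, max_ramprate_limit=3):
        print(batch)
```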
  • Interface 9: Log/Error messages.
    • Change Control: Stable
    • Synopsis: New log/error messages.
    • Details: The new log and error messages introduced by the power ramp rate limiting feature are listed below.

      • Scenario 1: Enable power_ramprate_enable server attribute
        • In server logs:
          attributes set: power_ramprate_enable = 1
        • Log level: LOG_INFO
      • Scenario 2: Nodes are being ramped down
        • In server logs:
          Job;power_ramp_down;launch: /opt/cray/capmc/default/bin/capmc set_sleep_state_limit --nids 24-25 --limit 4
          Job;power_ramp_down;launch: finished
        • Log level: LOG_INFO
      • Scenario 3: Nodes are being ramped up
        • In server logs:
          Job;power_ramp_up;launch: /opt/cray/capmc/default/bin/capmc set_sleep_state_limit --nids 24-25 --limit 0
          Job;power_ramp_up;launch: finished
        • Log level: LOG_INFO
      • Scenario 4: Server periodic hook output
        • In server logs:
          power_ramp_limit: nodes to ramp up: <node_list>
          power_ramp_limit: nodes to ramp down: <node_list>
        • Log level: LOG_INFO
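The launch lines in the ramp-down and ramp-up log messages can be reproduced with a short sketch. Only the command path, the set_sleep_state_limit subcommand, the --nids and --limit options, and the values 4 (ramp down) and 0 (ramp up) are taken from the log messages above. The helper names and the comma-separated form used for non-contiguous node IDs are assumptions.

```python
CAPMC = "/opt/cray/capmc/default/bin/capmc"  # path as shown in the logs


def compress_nids(nids):
    """Collapse node IDs into a range string such as "24-25".

    Assumption: non-contiguous groups are joined with commas; the log
    messages only show a single contiguous range.
    """
    ranges = []
    start = prev = None
    for nid in sorted(set(nids)):
        if start is None:
            start = prev = nid
        elif nid == prev + 1:
            prev = nid
        else:
            ranges.append((start, prev))
            start = prev = nid
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else "%d-%d" % (a, b)
                    for a, b in ranges)


def sleep_state_limit_command(nids, limit):
    """Build the command text that appears after "launch:" in the logs."""
    return "%s set_sleep_state_limit --nids %s --limit %d" % (
        CAPMC, compress_nids(nids), limit)


if __name__ == "__main__":
    print(sleep_state_limit_command([24, 25], 4))  # ramp down
    print(sleep_state_limit_command([24, 25], 0))  # ramp up
```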