Interface #1

...

Visibility: Public

...

Change Control: Stable

...

Synopsis: PBS hook power control module

...

A new class, “pbs.Power”, will be made available to provide power management functionality.  A hook will be able to access it through a Python import.

...

Reference to more detail on the interface. The following define the PMI operations available:

...

pmi_activate_profile

  1. Activate a given power profile on a set of hosts on behalf of a given job.  The parameter “profile_name” is a string containing the name of a profile.  The parameter “hosts” is a list containing strings that specify hostnames.  The parameter “job” is a PBS job object.  If the hosts parameter is not specified, the hosts will be calculated from the job object.  If the job parameter is not specified, the pbs.event().job object will be used.

  2. The return type is bool.  True indicates success.  False indicates that the request was made but the PMI gave no indication of whether it succeeded.

  3. If an error occurs where it is appropriate for some or all of the job vnodes to be marked offline, this may be done before an exception is raised.

  4. If an error occurs where it is appropriate for the supported profile names for some or all of the job vnodes to be refreshed, this may be done before an exception is raised.
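The return value and error behavior described above could be handled in a hook along the following lines. This is a minimal sketch, not the PBS implementation: the helper name `activate_profile_checked` and its status strings are hypothetical, `power` stands for an already connected `pbs.Power` instance, and because the exception classes are only named in this document, the sketch catches a generic `Exception`.

```python
def activate_profile_checked(power, profile_name, hosts=None, job=None):
    """Call pmi_activate_profile and classify the outcome.

    Returns "ok" when the PMI confirmed success, "unconfirmed" when the
    request was made without confirmation from the PMI, and "failed"
    when an exception was raised. The status strings are hypothetical.
    """
    kwargs = {}
    # Only pass the optional parameters that were given, so that the
    # documented defaults (hosts from the job, job from the event) apply.
    if hosts is not None:
        kwargs["hosts"] = hosts
    if job is not None:
        kwargs["job"] = job
    try:
        confirmed = power.pmi_activate_profile(profile_name, **kwargs)
    except Exception:
        # Vnodes may already have been marked offline, or profile names
        # refreshed, before the exception reached this point.
        return "failed"
    return "ok" if confirmed else "unconfirmed"
```

A hook might log the "unconfirmed" case at a lower severity than "failed", since the request itself was delivered.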

...

pmi_get_usage

  1. Retrieve power usage for a job.  The parameter “job” is a PBS job object.

  2. The return will be a float which gives the cumulative energy usage for the job at the time of the call in kilowatt-hours (kWh).  If no power usage information is available, None is returned.
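Because the return value is either a float in kWh or None, a caller has to handle both cases. The sketch below converts a usage value to joules (1 kWh = 3.6 MJ), the unit used by some of the Cray log messages later in this document. The function name `usage_in_joules` is a hypothetical helper, not part of the pbs.Power interface.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def usage_in_joules(usage_kwh):
    """Convert a pmi_get_usage() result from kWh to joules.

    None (no power usage information available) is passed through.
    """
    if usage_kwh is None:
        return None
    return usage_kwh * JOULES_PER_KWH
```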

...

pmi_deactivate_profile

  1. Inform the PMI that a job is no longer active.  This would be used when a job is suspended or terminated.  The parameter “job” is a PBS job object.  If it is not specified, the pbs.event().job object will be used.

  2. The return type is bool.  True indicates success.  False indicates that the request was made but the PMI gave no indication of whether it succeeded.

...

pmi_query

  1. Return information that matches a request type.  The parameter “query_type” specifies what should be returned.  The only supported value for query_type is PMI_QUERY_PROFILE, for which the return is a list of strings giving the profile names supported by the PMI.
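One use of the profile list is to check a profile name before trying to activate it. The following is a minimal sketch under stated assumptions: `profile_supported` is a hypothetical helper, `power` is an already connected `pbs.Power` instance, and treating a None result as an empty list is a defensive assumption rather than documented behavior.

```python
def profile_supported(power, profile_name):
    """Return True if the PMI reports profile_name as a supported profile."""
    names = power.pmi_query(power.PMI_QUERY_PROFILE)
    # Assumption: tolerate a None result by treating it as "no profiles".
    return profile_name in (names or [])
```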

...

pmi_connect

  1. Connect to the PMI.  The parameter “endpoint” defaults to None and is a string which will be meaningful to the PMI.  The parameter “port” defaults to None and is an integer.  A typical usage would be “endpoint” specifying a hostname and “port” giving a network port for a network service connection.

  2. Currently, connection and disconnection are done per hook rather than through a long-lasting session.

  3. Nothing is returned; the connection information is maintained in the instance of the Power class.

  4. If the endpoint or port parameters are not specified, the underlying code specific to the PMI will determine the connection details.
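Since the connection is opened and closed within each hook, the connect and disconnect calls can be paired so that the disconnect always runs. The sketch below wraps them in a context manager. The name `pmi_session` is hypothetical, and passing `endpoint` and `port` as keyword arguments assumes the parameter names given above.

```python
from contextlib import contextmanager


@contextmanager
def pmi_session(power, endpoint=None, port=None):
    """Connect to the PMI for the duration of a with-block.

    pmi_disconnect() is called even if the body raises an exception.
    """
    power.pmi_connect(endpoint=endpoint, port=port)
    try:
        yield power
    finally:
        power.pmi_disconnect()
```

Inside a hook this would be used as `with pmi_session(pbs.Power(), port=3564) as p:` followed by the PMI calls.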

...

pmi_disconnect

  1. Disconnect from the PMI.  No parameters are needed because each instance of the Power class is associated with one backend power management interface.

...

Exceptions

  1. InternalError - raised when the underlying cause of a failure cannot be determined.

  2. BackendError - raised when the backend PMI call is unsuccessful.

...

Power module initialization

  1. A string can optionally be passed to specify the name of the PMI to be used (see I.1.11).  By default, the type of PMI to be used will be determined automatically based on the type of hardware used.

Examples

Activate a profile from a job specific event.

import pbs

p = pbs.Power()

p.pmi_connect("power_master")

p.pmi_activate_profile("LOW")

p.pmi_disconnect()

Get profile name list.

import pbs

p = pbs.Power()

p.pmi_connect(port=3564)

pnames = p.pmi_query(p.PMI_QUERY_PROFILE)

p.pmi_disconnect()

Deactivate profile on a specific job.

import pbs

p = pbs.Power()

badjob = pbs.server().job("10")

p.pmi_connect()

p.pmi_deactivate_profile(job=badjob)

p.pmi_disconnect()

...

  1. Visibility: Public
  2. Change Control: Stable
  3. Synopsis: Generically applicable server power_provisioning flag
  4. Reference to more detail on the interface.
    1. The power_provisioning boolean server attribute is unset by default, visible to all, and changeable by an administrator.  When it is set to True, PMI operations may take place if allowed by the power_enable flag (see I.1.10).  If it is unset or set to False, no PMI operations will take place on any vnode.
    2. Use qmgr to set the power_provisioning flag to true or false.  For example:   qmgr -c "set server power_provisioning = true"
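The two-level gating described above (a server flag plus a per-vnode flag) reduces to a simple predicate. The sketch below models an unset attribute as None. The function name and the representation of the flags are assumptions made for illustration, not the PBS implementation.

```python
def pmi_operations_allowed(server_flag, vnode_flag):
    """Return True only when both flags are explicitly set to True.

    None models an unset attribute, which is treated like False.
    """
    return server_flag is True and vnode_flag is True
```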

...

  1. Visibility: Public
  2. Change Control: Stable
  3. Synopsis: Generically applicable energy usage for a job
    1. Add a new attribute for a job: resources_used.energy
  4. Reference to more detail on the interface.
    1. The type will be float.
    2. The units will be kWh.  For example:  resources_used.energy=64.2
    3. The resources_used.energy value will only be updated when PMI operations are allowed on the vnodes used by the job.

...

  1. Visibility: Public
  2. Change Control: Stable
  3. Synopsis: Generically applicable resource “eoe”
    1. A new resource similar to “aoe” is added to both jobs and vnodes to specify the energy operational environment.
  4. Reference to more detail on the interface.
    1. It is added to the scheduler's default resource list in the sched_config file.
    2. It is a non-consumable resource.
    3. It is a resource of type string array, set in the resources_available attribute, e.g. resources_available.eoe="low,med,high".
    4. It contains the list of all power profile names that are available on a vnode. By default, resources_available.eoe is unset.
    5. The list is visible to all but settable only by a manager.
    6. A job requests eoe per chunk through Resource_List.eoe in -l select, e.g. -l select=1:ncpus:eoe=low.
    7. Only one eoe value can be active on a vnode at a time.
    8. A job may request Resource_List.eoe in a select statement, but no more than one distinct eoe value is currently supported, e.g. -lselect=1:ncpus=1:eoe=med+1:ncpus=2:eoe=med
    9. If a job request is made with more than one value for eoe (e.g. -l select=1:eoe=low+1:eoe=high), it will be rejected by qsub with the error “qsub: only one value of eoe is allowed”.
    10. A value for resources_available.eoe will not be automatically set on the system(s) where the PBS server and scheduler are running.
    11. If both an aoe and eoe are set for a job, the aoe setting will be processed first by the scheduler.
    12. The scheduler will not preempt a job with eoe set using suspend or checkpoint.
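The single-value rule for eoe can be illustrated by collecting the distinct eoe values from a select specification. This is a simplified sketch and not the qsub implementation: the function names are hypothetical, the input is the text after `select=`, and quoting or other select syntax beyond `+` and `:` separators is ignored.

```python
def requested_eoe_values(select_spec):
    """Return the set of distinct eoe values requested across all chunks."""
    values = set()
    for chunk in select_spec.split("+"):
        for part in chunk.split(":"):
            key, sep, value = part.partition("=")
            if sep and key == "eoe":
                values.add(value)
    return values


def check_single_eoe(select_spec):
    """Raise ValueError if more than one distinct eoe value is requested."""
    if len(requested_eoe_values(select_spec)) > 1:
        raise ValueError("only one value of eoe is allowed")
```

With this sketch, `1:ncpus=1:eoe=med+1:ncpus=2:eoe=med` passes because both chunks name the same value, while `1:eoe=low+1:eoe=high` raises an error.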

...


Technical Term    Description or Definition
PMI               Power Management Infrastructure



Traceability Matrix

Use Case(s)    Requirement(s)        Interface(s)
2.a            3.a, 3.b, 3.e, 3.f    2, 10, 11
2.b            3.d, 3.g              1, 4, 6, 7, 8, 9, 12, 24-38, 40-49
2.c            3.c                   3, 5, 13, 18, 20, 21, 39



A. Interface changes

  1. Interface #1
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Generically applicable server power_provisioning flag
    4. Reference to more detail on the interface.
      1. The power_provisioning boolean server attribute is unset by default, visible to all, and changeable by a manager.  When it is set to True, PMI operations may take place if allowed by the vnode power_provisioning flag (see A.1.9).  If it is unset or set to False, no PMI operations will take place on any vnode.
      2. Use qmgr to set the power_provisioning flag to true or false.  For example:

        qmgr -c "set server power_provisioning = true"


  2. Interface #2
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Generically applicable energy usage for a job
      1. Add a new attribute for a job: resources_used.energy
    4. Reference to more detail on the interface.
      1. The type will be float.
      2. The units will be kWh.  For example:  resources_used.energy=64.2
      3. The resources_used.energy value will only be updated when PMI operations are allowed on the vnodes used by the job. The resources_used.energy value will not be seen in qstat -f output or server/accounting logs when PMI operations are not allowed on the node.
  3. Interface #3
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Generically applicable resource “eoe”
      1. A new resource similar to “aoe” is added to both jobs and vnodes to specify the energy operational environment.
    4. Reference to more detail on the interface.
      1. It is added to the scheduler's default resource list in the sched_config file.
      2. It is a non-consumable resource.
      3. It is a resource of type string array, set in the resources_available attribute, e.g. resources_available.eoe="low,med,high".
      4. It contains the list of all power profile names that are available on a vnode. By default, resources_available.eoe is unset.
      5. The list is visible to all but settable only by a manager.
      6. A job requests eoe per chunk through Resource_List.eoe in -l select, e.g. -l select=1:ncpus:eoe=low. This requests one chunk from a node with resources_available.eoe=low.
      7. Only one eoe value can be active on a vnode at a time.
      8. A job may request Resource_List.eoe in a select statement, but no more than one distinct eoe value is currently supported, e.g. -lselect=1:ncpus=1:eoe=med+1:ncpus=2:eoe=med
      9. If a job request is made with more than one value for eoe (e.g. -l select=1:eoe=low+1:eoe=high), it will be rejected by qsub with the error “qsub: only one value of eoe is allowed”.
      10. A value for resources_available.eoe will not be automatically set on the system(s) where the PBS server and scheduler are running.
      11. If both an aoe and eoe are set for a job, the aoe setting will be processed first by the scheduler.
      12. The scheduler will not preempt a job with eoe set using suspend or checkpoint.
  4. Interface #4
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Generically applicable vnode attribute: current_eoe
    4. Reference to more detail on the interface.
      1. Identifies the eoe active on a vnode. It is of type String. By default, it is unset. It is settable only by a manager and visible to all PBS users.
  A.1.9.4.3               A negative value will result in a PBSE_BADATVAL error.
  5. A.1.9.4.4               DELETED
  6. A.1.9.5          Comments on the interface
  7. A.1.9.5.1               Standing of the interface: new interface
  8. A.1.9.5.2               Interface type: Other
  9. A.1.10   Interface #11
  10. A.1.10.1        Visibility: Public
  11. A.1.10.2        Change Control: Experimental
  12. A.1.10.3        Synopsis: Generically applicable vnode power enable flag
  13. A.1.10.4        Reference to more detail on the interface.
  14. A.1.10.4.1            The power_enable boolean vnode attribute is unset by default, visible to all, and changeable by an administrator.
  15. A.1.10.4.2            Use qmgr to set the power_enable flag to true or false.  For example:
  16.                                                 qmgr -c "set node bigbox power_enable = true"
  17. A.1.10.4.3            When it is set to True, PMI operations may take place on the vnode.  If it is unset or set to False, no PMI operations are allowed to take place on the vnode.
  18. A.1.10.5        Comments on the interface
  19. A.1.10.5.1            Standing of the interface: new interface
  20. A.1.10.5.2            Interface type: Other
  21. A.1.11   Interface #12
  22. A.1.11.1        Visibility: Private
  23. A.1.11.2        Change Control: Unstable
  24. A.1.11.3        Synopsis: Expose the hook PMI structure to allow additions to the supported PMI list.
  25. A.1.11.4        Reference to more detail on the interface.
  26. A.1.11.4.1            The PBS “power” hook can be modified to specify a PMI name in the pbs.Power() instantiation in the init_power function.  For example, the code below would cause the new file described in I.1.11.4.2 to be used by the hook:
  27.                                                 power = pbs.Power("ipmitool")
  28. A.1.11.4.2            Python code patterned after the file PBS_EXEC/lib/python/altair/pbs/v1/_pmi_none.py must be placed in a file where none is replaced by the PMI name being implemented.  For example:
  29.                                                 # cd $PBS_EXEC/lib/python/altair/pbs/v1
  30.                                                 # cp _pmi_none.py _pmi_ipmitool.py
  31.                                                 # vi _pmi_ipmitool.py
  32. A.1.11.4.3            The defined functions must all be present: __init__, _pmi_connect, _pmi_disconnect, _pmi_get_usage, _pmi_query, _pmi_activate_profile, _pmi_deactivate_profile.  These all take the same arguments as those in I.1.1, except that each function name has an initial underscore ('_').
  33. A.1.11.5        Comments on the interface
  34. A.1.11.5.1            Standing of the interface: new interface
  35. A.1.11.5.2            Interface type: Other
  36. A.1.12   Interface #13
  37. A.1.12.1        Visibility: Public
  38. A.1.12.2        Change Control: Experimental
  39. A.1.12.3       
      1. A job J1 running with an eoe setting X will cause the value of current_eoe to be set to X on the vnodes assigned to J1 that allow PMI operations.
      2. Manually changing current_eoe is unsupported.
      3. The scheduler can run a job requesting an eoe on vnodes with a current_eoe value that matches the job eoe.
      4. The scheduler can only run a job on a vnode where the current_eoe does not match the job eoe if no jobs are running on the vnode and PMI operations are allowed on the vnode.
      5. When a job ends, the deactivate operation will take place if all the vnodes used by the job have no other jobs running and allow PMI operations.  At this point, current_eoe will be unset on all the vnodes used by the job.
  40. Interface #5
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Cray specific resource: pstate
    4. Reference to more detail on the interface.
      1. Cray ALPS reservation setting for p-state.  See Basil 1.4 documentation.
      2. It is of type String. By default, it is unset. It is settable and visible to all PBS users.
  41. Interface #6
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Cray specific resource: pgov
    4. Reference to more detail on the interface.
      1. Cray ALPS reservation setting for p-governor.  See Basil 1.4 documentation.
      2. It is of type String. By default, it is unset. It is settable and visible to all PBS users.
  42. Interface #7
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Cray specific resource: pcap_node
    4. Reference to more detail on the interface.
      1. Cray capmc set_power_cap --node setting.  See capmc documentation.
      2. It is of type Int. By default, it is unset. It is settable and visible to all PBS users.
      3. A negative value will result in a PBSE_BADATVAL error.
  43. Interface #8
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Cray specific resource: pcap_accelerator
    4. Reference to more detail on the interface.
      1. Cray capmc set_power_cap --accel setting.  See capmc documentation.
      2. It is of type Int. By default, it is unset. It is settable and visible to all PBS users.
      3. A negative value will result in a PBSE_BADATVAL error.
  44. Interface #9
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: Generically applicable vnode power_provisioning flag
    4. Reference to more detail on the interface.
      1. The power_provisioning boolean vnode attribute is unset by default, visible to all, and changeable by a manager.
      2. Use qmgr to set the power_provisioning flag to true or false.  For example:

        qmgr -c "set node bigbox power_provisioning = true"

      3. When it is set to True, PMI operations may take place on the vnode.  If it is unset or set to False, no PMI operations are allowed to take place on the vnode.

  45. Interface #10
    1. Visibility: Public
    2. Change Control: Stable
    3. Synopsis: MoM logs a message using logjobmsg when a job ends and the value of current_eoe is unset.
    4. Reference to more detail on the interface.
      1. Example:

        11/19/2014 14:44:15;0008;pbs_python;Job;165.bigcray;PMI: reset current_eoe

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message

  46. Interface #11
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: When the energy for a job on an SGI HPE system is obtained, it will be logged by MoM using logjobmsg.
    4. Reference to more detail on the interface.
      1. Example:

        11/06/2014 18:35:26;0008;pbs_python;Job;4856.iceberg;SGI HPE: energy 1.456kWh

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message

  47. Interface #12
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: The Cray capmc command invocations will be logged by MoM using LOG_DEBUG with the keyword “launch”.
    4. Reference to more detail on the interface.
      1. Example:

        11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;Cray: 167.bigcray launch: /opt/cray/capmc/default/bin/capmc get_node_energy_counter --nids 0

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message

  48. Interface #13
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: Following a successful Cray capmc invocation, a message will be logged by MoM using LOG_WARNING if the time used by capmc is greater than 30 seconds.
    4. Reference to more detail on the interface.
      1. Example:

        11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; 21.bigcray;launch: finished  in 156 seconds

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message

  49. Interface #14
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: If Cray capmc writes anything to stderr, the first line will be logged by MoM using LOG_WARNING after the “launch” message.
    4. Reference to more detail on the interface.
      1. Cray has not documented the possible stderr output from capmc.
      2. Example:

        11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; 21.bigcray;launch stderr: i fell and cannot get up

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message

  50. Interface #15
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: When Cray capmc is run with the argument “get_node_energy_counter”, the node count is checked and, if it is wrong, a message will be logged by MoM using logjobmsg.
    4. Reference to more detail on the interface.
      1. The same command will be run one additional time if an error is seen.  No message will be logged for the first error.  If an error occurs after the second attempt, a message is logged.
      2. For example:

        11/19/2014 15:20:05;0008;pbs_python;Job;166.centos1;error: node count 2, should be 1

      3. The output from capmc should include a node count.  If it does not, the messages will show “not set” instead of a number.
      4. Example:

        11/19/2014 15:20:05;0008;pbs_python;Job;166.centos1;node count not set, should be 1

      ...

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message


                          6. Interface #16
                            1. Visibility: Public
                            A.1.18.2       
                            1. Change Control:
                            ExperimentalA.1.18.3       
                            1. Unstable
                            2. Synopsis: If Cray RUR is configured (see
                            I
                            1. B.
                            2.
                            1. 1.
                            5
                            1. f), log messages will be logged by MoM using logjobmsg when a job ends.
                            1. Reference to more detail on the interface.
                              1. A message will show the energy used by each aprun run by a job and a job tally in Joules.  For example:

                                11/19/2014 18:17:16;0008;pbs_python;Job;267.bigcray;Cray:RUR: {"apid":34876,"apid_energy":83876J,"job_energy":83876J}

                                11/19/2014 18:17:16;0008;pbs_python;Job;267.bigcray;Cray:RUR: {"apid":34972,"apid_energy":84272J,"job_energy":168148J}

                                11/19/2014 18:17:16;0008;pbs_python;Job;267.bigcray;Cray:RUR: {"apid":35234,"apid_energy":83194J,"job_energy":251342J}

                            2. Comments on the interface
                              1. Standing of the interface: new interface
                              2. Interface type: Log message


  52. Interface #17
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: If Cray RUR is not configured, a log message will be logged by MoM using logjobmsg when a job ends.
    4. Reference to more detail on the interface.
      1. Example:

        11/19/2014 18:17:16;0008;pbs_python;Job;267.bigcray;Cray: no RUR data

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message


  53. Interface #18
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: At the end of a job on a Cray, the energy reported by capmc for the compute nodes used by the job will be logged by MoM using logjobmsg.
    4. Reference to more detail on the interface.
      1. Example:

        11/06/2014 18:35:26;0008;pbs_python;Job;156.bigcray;energy usage 554520J

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message


  54. Interface #19
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: The energy reported by capmc for the compute nodes used by a job on a Cray will be logged by MoM using logjobmsg every 5 minutes as the job runs.
    4. Reference to more detail on the interface.
      1. Example:

        11/06/2014 18:35:26;0008;pbs_python;Job;156.bigcray;Cray: get_usage: energy 346342J

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message


  55. Interface #20
    1. Visibility: Public
    2. Change Control: Unstable
    3. Synopsis: When the PMI on a Cray is initialized, MoM will log messages at LOG_DEBUG.
    4. Reference to more detail on the interface.
      1. Example:

        11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;Cray: init

        11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;Cray: connect

    5. Comments on the interface
      1. Standing of the interface: new interface
      2. Interface type: Log message


                                          7. Interface #21
                                            1. Visibility: Public
                                            2. Change Control: Unstable
                                            3. Synopsis: When pmi_get_usage() is called for a job on a Cray, a message will be logged by MoM using logjobmsg.
                                            4. Reference to more detail on the interface.
                                                  1. Example:

                                                    11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: get_usage

                                              1. A.1.23.5        Comments on the interface
                                              2. A.1.23.5.1            Standing of the interface: new interface
                                              3. A.1.23.5.2            Interface type: Log message
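The example log lines in this section all use the same semicolon-separated layout. The following is a minimal sketch of how such a line could be split into its parts when checking MoM logs for these messages. The field names (`timestamp`, `event_code`, `daemon`, `object_type`, `object_name`, `message`) and the reading of the event code as hexadecimal are assumptions made for illustration; they are not defined by this document.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    # Field names are illustrative assumptions, not taken from the design.
    timestamp: str
    event_code: int
    daemon: str
    object_type: str
    object_name: str
    message: str


def parse_log_line(line: str) -> LogRecord:
    """Split one log line into its six semicolon-separated fields."""
    # Split at most five times so semicolons inside the message survive.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        raise ValueError("expected 6 semicolon-separated fields: %r" % line)
    timestamp, code, daemon, object_type, object_name, message = parts
    # Assumption: the event code field is a hexadecimal value.
    return LogRecord(timestamp, int(code, 16), daemon,
                     object_type, object_name, message)


record = parse_log_line(
    "11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: get_usage"
)
print(record.object_name, "->", record.message)
```

Splitting with a maximum of five separators keeps the message text intact even if a message itself contains a semicolon.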
                                              4. A.1.24   Interface #25
                                              5. A.1.24.1        Visibility: Private

                                              7. Interface #22
                                                1. Visibility: Public
                                                2. Change Control: Unstable
                                                3. Synopsis: When pmi_query() is called on a Cray, a message will be logged by MoM using LOG_DEBUG.
                                                4. Reference to more detail on the interface.
                                                      1. Example:

                                                        11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;Cray: query

                                                  1. A.1.24.5        Comments on the interface
                                                  2. A.1.24.5.1            Standing of the interface: new interface
                                                  3. A.1.24.5.2            Interface type: Log message
                                                  4. A.1.25   Interface #26
                                                  5. A.1.25.1        Visibility: Private

                                                  7. Interface #23
                                                    1. Visibility: Public
                                                    2. Change Control: Unstable
                                                    3. Synopsis: When pmi_activate_profile() is called on a Cray, a message will be logged by MoM using LOG_DEBUG.
                                                    4. Reference to more detail on the interface.
                                                          1. Example:

                                                            11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;Cray: 167.centos1 activate 'low'

                                                      1. A.1.25.5        Comments on the interface
                                                      2. A.1.25.5.1            Standing of the interface: new interface
                                                      3. A.1.25.5.2            Interface type: Log message
                                                      4. A.1.26   Interface #27
                                                      5. A.1.26.1        Visibility: Private

                                                      7. Interface #24
                                                        1. Visibility: Public
                                                        2. Change Control: Unstable
                                                        3. Synopsis: When pmi_activate_profile() is called on a Cray but no compute nodes are allocated to the job, a message will be logged by MoM using logjobmsg.
                                                        4. Reference to more detail on the interface.
                                                              1. Example:

                                                                11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: no compute nodes for power setting

                                                          1. A.1.26.5        Comments on the interface
                                                          2. A.1.26.5.1            Standing of the interface: new interface
                                                          3. A.1.26.5.2            Interface type: Log message
                                                          4. A.1.27   Interface #28
                                                          5. A.1.27.1        Visibility: Private

                                                          7. Interface #25
                                                            1. Visibility: Public
                                                            2. Change Control: Unstable
                                                            3. Synopsis: When pmi_activate_profile() is called on a Cray and the job has pcap_node set, a message will be logged by MoM using logjobmsg showing the pcap_node value.
                                                            4. Reference to more detail on the interface.
                                                                  1. Example:

                                                                    11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: pcap node 350

                                                              1. A.1.27.5        Comments on the interface
                                                              2. A.1.27.5.1            Standing of the interface: new interface
                                                              3. A.1.27.5.2            Interface type: Log message
                                                              4. A.1.28   Interface #29
                                                              5. A.1.28.1        Visibility: Private

                                                              7. Interface #26
                                                                1. Visibility: Public
                                                                2. Change Control: Unstable
                                                                3. Synopsis: When pmi_activate_profile() is called on a Cray and the job has pcap_accelerator set, a message will be logged by MoM using logjobmsg showing the pcap_accelerator value.
                                                                4. Reference to more detail on the interface.
                                                                      1. Example:

                                                                        11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: pcap accel 250

                                                                  1. A.1.28.5        Comments on the interface
                                                                  2. A.1.28.5.1            Standing of the interface: new interface
                                                                  3. A.1.28.5.2            Interface type: Log message
                                                                  4. A.1.29   Interface #30
                                                                  5. A.1.29.1        Visibility: Private

                                                                  7. Interface #27
                                                                    1. Visibility: Public
                                                                    2. Change Control: Unstable
                                                                    3. Synopsis: When pmi_activate_profile() is called on a Cray and the job has neither pcap_node nor pcap_accelerator set, a message will be logged by MoM using logjobmsg.
                                                                    4. Reference to more detail on the interface.
                                                                          1. Example:

                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: no power cap to set

                                                                      1. A.1.29.5        Comments on the interface
                                                                      2. A.1.29.5.1            Standing of the interface: new interface
                                                                      3. A.1.29.5.2            Interface type: Log message
                                                                      4. A.1.30   Interface #31
                                                                      5. A.1.30.1        Visibility: Private

                                                                      6. Interface #28
                                                                        1. Visibility: Public
                                                                        2. Change Control: Unstable
                                                                        3. Synopsis: When pmi_deactivate_profile() is called on a Cray, a message will be logged by MoM using LOG_DEBUG.
                                                                        4. Reference to more detail on the interface.
                                                                              1. Example:

                                                                                11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;Cray: deactivate 167.centos1

                                                                          1. A.1.30.5        Comments on the interface
                                                                          2. A.1.30.5.1            Standing of the interface: new interface
                                                                          3. A.1.30.5.2            Interface type: Log message
                                                                          4. A.1.31   Interface #32
                                                                          5. A.1.31.1        Visibility: Private

                                                                          6. Interface #29
                                                                            1. Visibility: Public
                                                                            2. Change Control: Unstable
                                                                            3. Synopsis: When pmi_deactivate_profile() is called on a Cray but no compute nodes are allocated to the job, a message will be logged by MoM using logjobmsg.
                                                                            4. Reference to more detail on the interface.
                                                                                  1. Example:

                                                                                    11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: no compute nodes for power setting

                                                                              1. A.1.31.5        Comments on the interface
                                                                              2. A.1.31.5.1            Standing of the interface: new interface
                                                                              3. A.1.31.5.2            Interface type: Log message
                                                                              4. A.1.32   Interface #33
                                                                              5. A.1.32.1        Visibility: Private

                                                                              6. Interface #30
                                                                                1. Visibility: Public
                                                                                2. Change Control: Unstable
                                                                                3. Synopsis: When pmi_deactivate_profile() is called on a Cray and the job has pcap_node set, a message will be logged by MoM using logjobmsg showing the pcap_node value.
                                                                                4. Reference to more detail on the interface.
                                                                                      1. Example:

                                                                                        11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: remove pcap node 350

                                                                                  1. A.1.32.5        Comments on the interface
                                                                                  2. A.1.32.5.1            Standing of the interface: new interface
                                                                                  3. A.1.32.5.2            Interface type: Log message
                                                                                  4. A.1.33   Interface #34
                                                                                  5. A.1.33.1        Visibility: Private

                                                                                  6. Interface #31
                                                                                    1. Visibility: Public
                                                                                    2. Change Control: Unstable
                                                                                    3. Synopsis: When pmi_deactivate_profile() is called on a Cray and the job has pcap_accelerator set, a message will be logged by MoM using logjobmsg showing the pcap_accelerator value.
                                                                                    4. Reference to more detail on the interface.
                                                                                          1. Example:

                                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: remove pcap accel 250

                                                                                      1. A.1.33.5        Comments on the interface
                                                                                      2. A.1.33.5.1            Standing of the interface: new interface
                                                                                      3. A.1.33.5.2            Interface type: Log message
                                                                                      4. A.1.34   Interface #35
                                                                                      5. A.1.34.1        Visibility: Private

                                                                                      6. Interface #32
                                                                                        1. Visibility: Public
                                                                                        2. Change Control: Unstable
                                                                                        3. Synopsis: When pmi_deactivate_profile() is called on a Cray and the job has neither pcap_node nor pcap_accelerator set, a message will be logged by MoM using logjobmsg.
                                                                                        4. Reference to more detail on the interface.
                                                                                              1. Example:

                                                                                                11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: no power cap to remove

                                                                                          1. A.1.34.5        Comments on the interface
                                                                                          2. A.1.34.5.1            Standing of the interface: new interface
                                                                                          3. A.1.34.5.2            Interface type: Log message
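The power-cap messages for pmi_activate_profile() and pmi_deactivate_profile() described above follow one pattern: one message per cap that is set, or a single "no power cap" message when neither is set. The sketch below reproduces only the message text from the examples. The function name `power_cap_messages`, its parameters, and the choice to emit both messages (node first) when both caps are set are assumptions for illustration, not the MoM implementation.

```python
from typing import List, Optional


def power_cap_messages(pcap_node: Optional[int] = None,
                       pcap_accelerator: Optional[int] = None,
                       deactivate: bool = False) -> List[str]:
    """Return the log message text expected for the given power caps."""
    # Deactivation messages carry a "remove " prefix in the examples.
    prefix = "remove " if deactivate else ""
    messages = []
    if pcap_node is not None:
        messages.append("Cray: %spcap node %d" % (prefix, pcap_node))
    if pcap_accelerator is not None:
        messages.append("Cray: %spcap accel %d" % (prefix, pcap_accelerator))
    if not messages:
        # Neither cap is set on the job.
        verb = "remove" if deactivate else "set"
        messages.append("Cray: no power cap to %s" % verb)
    return messages


for text in power_cap_messages(pcap_node=350, pcap_accelerator=250):
    print(text)
```

A helper like this could be used in a test to build the strings to search for in the MoM log after a job with known pcap_node and pcap_accelerator values has run.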
                                                                                          4. A.1.35   Interface #36
                                                                                          5. A.1.35.1        Visibility: Private

                                                                                          6. Interface #33
                                                                                            1. Visibility: Public
                                                                                            2. Change Control: Unstable
                                                                                            3. Synopsis: If Cray RUR is configured but the file created by the output plugin has a permission problem, a message will be logged by MoM using logjobmsg.
                                                                                            4. Reference to more detail on the interface.
                                                                                                  1. The file must be owned by root (UID 0) and must not be writable by others.
                                                                                                  2. Example:

                                                                                                    11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: RUR file:  /var/spool/PBS/spool/167.centos1.rur should only be writable by root

                                                                                              1. A.1.35.5        Comments on the interface
                                                                                              2. A.1.35.5.1            Standing of the interface: new interface
                                                                                              3. A.1.35.5.2            Interface type: Log message
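The ownership and permission rule for the RUR output file can be expressed as a small check. This is a sketch of the rule as stated above (owner UID 0, no write permission for others), not the code MoM uses; the function names `mode_is_trusted` and `rur_file_is_trusted` are hypothetical.

```python
import os
import stat


def mode_is_trusted(uid: int, mode: int) -> bool:
    """Apply the stated rule to an owner UID and a file mode."""
    owned_by_root = (uid == 0)
    other_writable = bool(mode & stat.S_IWOTH)
    return owned_by_root and not other_writable


def rur_file_is_trusted(path: str) -> bool:
    """Stat the file at 'path' and apply the same rule."""
    info = os.stat(path)
    return mode_is_trusted(info.st_uid, info.st_mode)


# Root-owned, mode 0644: accepted.
print(mode_is_trusted(0, 0o100644))
# Root-owned but world-writable, mode 0666: rejected.
print(mode_is_trusted(0, 0o100666))
```

Keeping the rule in a function that takes the UID and mode directly makes it possible to exercise it without root privileges or a real RUR file.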
                                                                                              4. A.1.36   Interface #37
                                                                                              5. A.1.36.1        Visibility: Private

                                                                                              6. Interface #34
                                                                                                1. Visibility: Public
                                                                                                2. Change Control: Unstable
                                                                                                3. Synopsis: If Cray RUR is configured and the file created by the output plugin can be read, a message will be logged by MoM using logjobmsg.
                                                                                                4. Reference to more detail on the interface.
                                                                                                  1. Example:

                                                                                                      11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray: reading RUR file:  /var/spool/PBS/spool/167.centos1.rur

                                                                                                  2. A.1.36.5        Comments on the interface
                                                                                                  3. A.1.36.5.1            Standing of the interface: new interface
                                                                                                  4. A.1.36.5.2            Interface type: Log message
                                                                                                  5. A.1.37   Interface #38
                                                                                                  6. A.1.37.1        Visibility: Private
                                                                                                  7. A.1.37.2        Change Control: Unstable

                                                                                                  8. Interface #35
                                                                                                    1. Visibility: Public
                                                                                                    2. Change Control: Unstable
                                                                                                    3. Synopsis: If the file created by the RUR output plugin can be read but the energy value cannot be parsed, a message will be logged by MoM using logjobmsg.
                                                                                                    4. Reference to more detail on the interface.
                                                                                                      1. A Python exception error string will be output as part of the message.
                                                                                                      2. Example:

                                                                                                        11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray:RUR: energy_used not found: unexpected EOF while parsing

                                                                                                  10. A.1.37.5        Comments on the interface
                                                                                                  11. A.1.37.5.1            Standing of the interface: new interface
                                                                                                  12. A.1.37.5.2            Interface type: Log message
                                                                                                  13. A.1.38   Interface #39
                                                                                                  14. A.1.38.1        Visibility: Private
                                                                                                  15. A.1.38.2        Change Control: Unstable

                                                                                                  16. Interface #36
                                                                                                    1. Visibility: Public
                                                                                                    2. Change Control: Unstable
                                                                                                    3. Synopsis: If the file created by the RUR output plugin can be read but the Cray energy RUR plugin has not been enabled, a message will be logged by MoM using logjobmsg.
                                                                                                    4. Reference to more detail on the interface.
                                                                                                          1. Example


                                                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray:RUR: warning: energy plugin not enabled by RUR

                                                                                                        A.1.38.5        Comments on the interface
                                                                                                      1. A.1.38.5.1            Standing of the interface: new interface
                                                                                                      2. A.1.38.5.2            Interface type: Log message
                                                                                                      3. A.1.39   Interface #40
                                                                                                      4. A.1.39.1        Visibility: Private

                                                                                                      5. Interface #37
                                                                                                        1. Visibility: Public
                                                                                                        2. Change Control: Unstable
                                                                                                        3. Synopsis: When the energy for a job is successfully obtained from RUR, MoM will log one of three possible messages using logjobmsg.
                                                                                                        4. Reference to more detail on the interface.
                                                                                                          1. If no energy value was obtained from capmc:


                                                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray:RUR: energy 4.234kWh


                                                                                                          2. If the energy value from capmc was smaller than what was obtained from RUR:


                                                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray:RUR: energy 4.234kWh replaces capmc energy 4.1432kWh


                                                                                                          3. If the energy value from capmc was greater than or equal to what was obtained from RUR:

                                                                                                            Info
                                                                                                            iconfalse

                                                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;Cray:RUR: energy 4.234kWh

                                                                                                        A.1.39.4.2            If the energy value from capmc was smaller than what was obtained from RUR:
                                                                                                          1. last capmc usage 4.2432kWh
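
The three cases above can be sketched as a small selection function. This is an illustrative sketch only, not the hook's actual code: the name `rur_energy_message` and its parameters are hypothetical, and the number formatting is an assumption chosen to match the sample messages.

```python
from typing import Optional


def rur_energy_message(rur_kwh: float, capmc_kwh: Optional[float]) -> str:
    """Build the log message text for the three cases listed above.

    Hypothetical helper: the name, signature, and formatting are assumptions.
    """
    base = f"Cray:RUR: energy {rur_kwh}kWh"
    if capmc_kwh is None:
        # Case 1: no energy value was obtained from capmc.
        return base
    if capmc_kwh < rur_kwh:
        # Case 2: the capmc value is smaller than the RUR value.
        return f"{base} replaces capmc energy {capmc_kwh}kWh"
    # Case 3: the capmc value is greater than or equal to the RUR value.
    return f"{base} last capmc usage {capmc_kwh}kWh"


if __name__ == "__main__":
    print(rur_energy_message(4.234, None))
    print(rur_energy_message(4.234, 4.1432))
    print(rur_energy_message(4.234, 4.2432))
```

Keeping the comparison in one place makes the boundary explicit: equal values fall into the third case, as the wording "greater than or equal to" requires.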


                                                                                                      7. Interface #38
                                                                                                        1. Visibility: Public
                                                                                                        2. Change Control: Unstable
                                                                                                        3. Synopsis: When the PMI on an SGI HPE is initialized, MoM will log messages at LOG_DEBUG.
                                                                                                        4. Reference to more detail on the interface.
                                                                                                          1. Example:

                                                                                                            11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI HPE: init

                                                                                                            11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI HPE: connect


                                                                                                      9. Interface #39
                                                                                                        1. Visibility: Public
                                                                                                        2. Change Control: Unstable
                                                                                                        3. Synopsis: When get_usage() is called for a job on an SGI HPE, a message will be logged by MoM using logjobmsg.
                                                                                                        4. Reference to more detail on the interface.
                                                                                                          1. Example:

                                                                                                            11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;SGI HPE: get_usage

                                                                                                      10. A.1.39.5        Comments on the interface
                                                                                                      11. A.1.39.5.1            Standing of the interface: new interface
                                                                                                      12. A.1.39.5.2            Interface type: Log message
                                                                                                      13. A.1.40   Interface #41
                                                                                                      14. A.1.40.1        Visibility: Private
                                                                                                      15. A.1.40.2        Change Control: Unstable
                                                                                                      16. A.1.40.3        Synopsis: When the PMI on an SGI is initialized, MoM will log messages at LOG_DEBUG.
                                                                                                      17. A.1.40.4        Reference to more detail on the interface.
                                                                                                      18. A.1.40.4.1            Example:
                                                                                                      19. 11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI: init
                                                                                                      20. 11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI: connect


                                                                                                      21. Interface #40
                                                                                                        1. Visibility: Public
                                                                                                        2. Change Control: Unstable
                                                                                                        3. Synopsis: When query() is called on an SGI HPE, a message will be logged by MoM using LOG_DEBUG.
                                                                                                        4. Reference to more detail on the interface.
                                                                                                          1. Example:


                                                                                                            11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI HPE: query
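
The sample messages in this section share one semicolon-separated layout. A minimal parsing sketch follows, assuming the six fields are timestamp, hexadecimal event class, daemon name, object type, object name, and message text. The field names and the `parse_log_line` helper are illustrative assumptions and are not defined by this document.

```python
from datetime import datetime
from typing import NamedTuple


class LogRecord(NamedTuple):
    """One parsed log line; field names are assumptions for illustration."""
    timestamp: datetime
    event_class: int
    daemon: str
    object_type: str
    object_name: str
    message: str


def parse_log_line(line: str) -> LogRecord:
    # Split on the first five semicolons only, so that any semicolon
    # inside the message text is preserved.
    stamp, code, daemon, obj_type, obj_name, message = line.strip().split(";", 5)
    return LogRecord(
        timestamp=datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S"),
        event_class=int(code, 16),  # assumed to be a hexadecimal value
        daemon=daemon,
        object_type=obj_type,
        object_name=obj_name,
        message=message,
    )


if __name__ == "__main__":
    sample = "11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI HPE: query"
    print(parse_log_line(sample))
```

A parser of this shape can be used in tests to check that a hook emitted the expected message text for a given job or hook name.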


                                                                                                      22. Interface #41
                                                                                                        1. Visibility: Public
                                                                                                        2. Change Control: Unstable
                                                                                                        3. Synopsis: When activate_profile() is called on an SGI HPE, a message will be logged by MoM using LOG_DEBUG.
                                                                                                        4. Reference to more detail on the interface.
                                                                                                          1. Example:

                                                                                                            11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;SGI HPE: 167.centos1 activate '450W'

                                                                                                      23. A.1.40.5        Comments on the interface
                                                                                                      24. A.1.40.5.1            Standing of the interface: new interface
                                                                                                      25. A.1.40.5.2            Interface type: Log message
                                                                                                      26. A.1.41   Interface #42
                                                                                                      27. A.1.41.1        Visibility: Private
                                                                                                      28. A.1.41.2        Change Control: Unstable


                                                                                                      29. Interface #42
                                                                                                        1. Visibility: Public
                                                                                                        2. Change Control: Unstable
                                                                                                        3. Synopsis: When deactivate_profile() is called on an SGI HPE, a message will be logged by MoM using LOG_DEBUG.
                                                                                                        4. Reference to more detail on the interface.
                                                                                                          1. Example:

                                                                                                            11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;SGI HPE: deactivate

                                                                                                        A.1.41.3        Synopsis: When pmi_get_usage is called for a job on an SGI, a message will be logged by MoM using logjobmsg.
                                                                                                        A.1.41.4        Reference to more detail on the interface.
                                                                                                        A.1.41.4.1            Example:
                                                                                                        11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;SGI: get_usage
                                                                                                          1. A.1.41.5        Comments on the interface
                                                                                                          2. A.1.41.5.1            Standing of the interface: new interface
                                                                                                          3. A.1.41.5.2            Interface type: Log message
                                                                                                          4. A.1.42   Interface #43
                                                                                                          5. A.1.42.1        Visibility: Private
                                                                                                          6. A.1.42.2        Change Control: Unstable


                                                                                                          7. Interface #43
                                                                                                            1. Visibility: Public
                                                                                                            2. Change Control: Unstable
                                                                                                            3. Synopsis: If any PMI operation is attempted for a job with a vnode assigned that does not have power_provisioning=True, a message will be logged by MoM using logjobmsg.
                                                                                                            4. Reference to more detail on the interface.
                                                                                                              1. Example

                                                                                                                11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1; power functionality is disabled on vnode v12

                                                                                                            A.1.42.3        Synopsis: When pmi_query is called on an SGI, a message will be logged by MoM using LOG_DEBUG.
                                                                                                            A.1.42.4        Reference to more detail on the interface.
                                                                                                            A.1.42.4.1            Example:
                                                                                                            11/19/2014 15:20:58;0006;pbs_python;Hook;pbs_python;SGI: query
                                                                                                          8. A.1.42.5        Comments on the interface
                                                                                                          9. A.1.42.5.1            Standing of the interface: new interface
                                                                                                          10. A.1.42.5.2            Interface type: Log message
                                                                                                          11. A.1.43   Interface #44
                                                                                                          12. A.1.43.1        Visibility: Private
                                                                                                          13. A.1.43.2        Change Control: Unstable


                                                                                                          14. Interface #44
                                                                                                            1. Visibility: Public
                                                                                                            2. Change Control: Unstable
                                                                                                            3. Synopsis: If the PMI hook is run with an unexpected event, MoM will log a message at LOG_WARNING.
                                                                                                            4. Reference to more detail on the interface.
                                                                                                              1. Example

                                                                                                                11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;Event not serviceable for power provisioning.

                                                                                                          15. A.1.43.3        Synopsis: When pmi_activate_profile is called on an SGI, a message will be logged by MoM using LOG_DEBUG.
                                                                                                            A.1.43.4        Reference to more detail on the interface.
                                                                                                            A.1.43.4.1            Example:
                                                                                                            11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;SGI: 167.centos1 activate '450W'
                                                                                                              1. A.1.43.5        Comments on the interface
                                                                                                              2. A.1.43.5.1            Standing of the interface: new interface
                                                                                                              3. A.1.43.5.2            Interface type: Log message
                                                                                                              4. A.1.44   Interface #45
                                                                                                              5. A.1.44.1        Visibility: Private
                                                                                                              6. A.1.44.2        Change Control: Unstable


                                                                                                              7. Interface #45
                                                                                                                1. Visibility: Public
                                                                                                                2. Change Control: Unstable
                                                                                                                3. Synopsis: When the PMI hook handles the EXECHOST_STARTUP event and the MoM is running on the same host as the pbs_server or pbs_sched, MoM will log a message at LOG_DEBUG.
                                                                                                                4. Reference to more detail on the interface.
                                                                                                                  1. Example

                                                                                                                    11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;Provisioning cannot be enabled on this host.

                                                                                                                A.1.44.3        Synopsis: When pmi_deactivate_profile is called on an SGI, a message will be logged by MoM using LOG_DEBUG.
                                                                                                                A.1.44.4        Reference to more detail on the interface.
                                                                                                                A.1.44.4.1            Example:
                                                                                                                11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python;SGI: deactivate
                                                                                                                  1. A.1.44.5        Comments on the interface
                                                                                                                  2. A.1.44.5.1            Standing of the interface: new interface
                                                                                                                  3. A.1.44.5.2            Interface type: Log message
                                                                                                                  4. A.1.45   Interface #46
                                                                                                                  5. A.1.45.1        Visibility: Private
                                                                                                                  6. A.1.45.2        Change Control: Unstable


                                                                                                                  7. Interface #46
                                                                                                                    1. Visibility: Public
                                                                                                                    2. Change Control: Unstable
                                                                                                                    A.1.45.3       
                                                                                                                    1. Synopsis: If any PMI operation
                                                                                                                    is attempted for a job with a vnode assigned that does not have power_enable=True
                                                                                                                    1. at job end throws a python exception, a message will be logged by MoM using logjobmsg showing the exception string.
                                                                                                                    A.1.45.4       
                                                                                                                    1. Reference to more detail on the
                                                                                                                    interface.A.1.45.4.1            Example
                                                                                                                    1. interface.
                                                                                                                      1. Example

                                                                                                                        Info
                                                                                                                        iconfalse

                                                                                                                        11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1;

                                                                                                                    power functionality is disabled on vnode v12
                                                                                                                  8. A.1.45.5        Comments on the interface
                                                                                                                  9. A.1.45.5.1            Standing of the interface: new interface
10. A.1.45.5.2            Interface type: Log message
                                                                                                                  11. A.1.46   Interface #47
                                                                                                                  12. A.1.46.1        Visibility: Private
                                                                                                                  13. A.1.46.2        Change Control: Unstable
                                                                                                                  14. A.1.46.3        Synopsis: If the PMI hook is run with an unexpected event,  MoM will log a message at LOG_WARNING.
                                                                                                                  15. A.1.46.4        Reference to more detail on the interface.
                                                                                                                  16. A.1.46.4.1            Example
17. 11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; Event not serviceable for power provisioning.
18. Interface #47
  1. Visibility: Public
  2. Change Control: Unstable
  3. Synopsis: If activate_profile() throws either of the Python exceptions defined in D.1.d.vii, a message will be logged by MoM at LOG_WARNING.
  4. Reference to more detail on the interface.
    1. If the exception is BackendError, query() is called to reset the eoe value for the natural vnode for the MoM.
    2. Example
       1/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; PMI:activate: set eoe: low,med,high
    3. If the exception is InternalError, the natural vnode for the MoM will be set offline.
    4. Example
       1/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; PMI:activate: set myself offline
                                                                                                                  19. A.1.46.5        Comments on the interface
                                                                                                                  20. A.1.46.5.1            Standing of the interface: new interface
21. A.1.46.5.2            Interface type: Log message
                                                                                                                  22. A.1.47   Interface #48
                                                                                                                  23. A.1.47.1        Visibility: Private
                                                                                                                  24. A.1.47.2        Change Control: Unstable
                                                                                                                  25. A.1.47.3        Synopsis: When the PMI hook handles the EXECHOST_STARTUP event and the MOM is running on the same host as the pbs_server or pbs_sched, MoM will log a message at LOG_DEBUG.
                                                                                                                  26. A.1.47.4        Reference to more detail on the interface.
                                                                                                                  27. A.1.47.4.1            Example
                                                                                                                  28. 11/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; Provisioning cannot be enabled on this host.
                                                                                                                  29. A.1.47.5        Comments on the interface
                                                                                                                  30. A.1.47.5.1            Standing of the interface: new interface
31. A.1.47.5.2            Interface type: Log message
                                                                                                                  32. A.1.48   Interface #49
                                                                                                                  33. A.1.48.1        Visibility: Private
                                                                                                                  34. A.1.48.2        Change Control: Unstable
                                                                                                                  35. A.1.48.3        Synopsis: If any PMI operation at job end throws a python exception, a message will be logged by MoM using logjobmsg showing the exception string.
                                                                                                                  36. A.1.48.4        Reference to more detail on the interface.
                                                                                                                  37. A.1.48.4.1            Example
                                                                                                                  38. 11/19/2014 17:24:21;0008;pbs_python;Job;167.centos1; socket.error: [Errno 111] Connection refused
                                                                                                                  39. A.1.48.5        Comments on the interface
                                                                                                                  40. A.1.48.5.1            Standing of the interface: new interface
41. A.1.48.5.2            Interface type: Log message
                                                                                                                  42. A.1.49   Interface #50
                                                                                                                  43. A.1.49.1        Visibility: Private
                                                                                                                  44. A.1.49.2        Change Control: Unstable
45. A.1.49.3        Synopsis: If pmi_activate_profile throws either of the Python exceptions defined in A.1.1.4.7, a message will be logged by MoM at LOG_WARNING.
                                                                                                                  46. A.1.49.4        Reference to more detail on the interface.
                                                                                                                  47. A.1.49.4.1            If the exception is BackendError, pmi_query is called to reset the eoe value for the natural vnode for the MoM.
                                                                                                                  48. A.1.49.4.2            Example
                                                                                                                  49. 1/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; PMI:activate: set eoe: low,med,high
                                                                                                                  50. A.1.49.4.3            If the exception is InternalError, the natural vnode for the MoM will be set offline.
                                                                                                                  51. A.1.49.4.4            Example
                                                                                                                  52. 1/19/2014 17:24:18;0006;pbs_python;Hook;pbs_python; PMI:activate: set myself offline
                                                                                                                  53. A.1.49.5        Comments on the interface
                                                                                                                  54. A.1.49.5.1            Standing of the interface: new interface
55. A.1.49.5.2            Interface type: Log message
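The handling described in A.1.49.4 can be sketched as follows. This is only an illustration of the control flow, not MoM's actual implementation: the two exception classes are local stand-ins for the BackendError and InternalError named above, and `FakePMI`, `handle_activate`, and the returned action strings are hypothetical names introduced for this example.

```python
class BackendError(Exception):
    """Stand-in for the PMI backend exception named in this interface."""


class InternalError(Exception):
    """Stand-in for the PMI internal exception named in this interface."""


class FakePMI:
    """Hypothetical PMI stub whose activation can be made to fail."""

    def __init__(self, failure=None):
        self.failure = failure

    def pmi_activate_profile(self, profile_name):
        if self.failure is not None:
            raise self.failure
        return True

    def pmi_query(self):
        # Pretend the backend reports these profile names.
        return ["low", "med", "high"]


def handle_activate(pmi, profile_name, log):
    """Try to activate a profile; on failure, record the recovery action."""
    try:
        pmi.pmi_activate_profile(profile_name)
        return "activated"
    except BackendError:
        # Refresh the profile names so the eoe value can be reset.
        names = pmi.pmi_query()
        log.append("PMI:activate: set eoe: " + ",".join(names))
        return "refreshed eoe"
    except InternalError:
        # The natural vnode would be marked offline at this point.
        log.append("PMI:activate: set myself offline")
        return "offline"


messages = []
for failure in (None, BackendError("backend down"), InternalError("bad state")):
    print(handle_activate(FakePMI(failure), "low", messages))
print(messages)
```

The `log` list stands in for MoM's logging at LOG_WARNING; a real hook would call the PBS logging functions instead.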
                                                                                                                  56. A.2  Administrator’s instructions
                                                                                                                  57. A.2.1      Installation does not require any new or different steps.  Once the PBS installation is complete, additional steps will be needed to enable power functionality.
58. A.2.1.1          Set power_provisioning on the server to true, and power_enable to true on the desired vnodes.  For example:
        # qmgr -c "set server power_provisioning = True"
        # qmgr -c "set node node1 power_enable = True"
        # qmgr -c "set node node2 power_enable = True"
        # qmgr -c "set node node3 power_enable = True"
63. If all vnodes will have power_enable set, @default can be used instead of individual vnode names.  For example:
        # qmgr -c "set node @default power_enable = True"
                                                                                                                  65. A.2.1.2          If eoe values are not provided by the PMI, additional steps are needed.
66. A.2.1.2.1               Use qmgr to set resources_available.eoe for vnodes.
67. A.2.1.2.2               Import a submit hook that will map the eoe values to job resources.  See A.2.4.5.
68. A.2.1.3          To disable power provisioning, set power_provisioning to false.
69. A.2.1.3.1               If power_provisioning is set to false while jobs are running, those jobs will not have their profiles deactivated when they finish, and their resources_used.energy values will not be set at the end of the job.
                                                                                                                  70. A.2.1.3.2               DELETED
                                                                                                                  71. A.2.1.4          To disable power provisioning on selected vnodes, set power_enable on the vnodes to False.
72. A.2.1.5          An optional step can be performed on a Cray system to use the RUR system to obtain the energy used by each aprun.  See Chapter 12 of the Cray document “Managing System Software for the Cray® Linux Environment” (http://docs.cray.com/books/S-2393-5101/S-2393-5101.pdf) for information on RUR.
                                                                                                                  73. A.2.1.5.1               The RUR config file has to be modified to use the PBS output plugin:
                                                                                                                  74. /opt/pbs/default/lib/cray/pbs_output.py
                                                                                                                  75. A.2.1.5.2                
                                                                                                                  76. A.2.1.6          If eoe values are provided by the PMI, additional steps are needed to allow the server and MOMs to communicate the eoe values.
                                                                                                                  77. A.2.1.6.1               DELETED
                                                                                                                  78. A.2.1.6.2               After setting power_enable on the desired vnodes, restart or HUP the MOMs.
                                                                                                                  79. A.2.1.6.3               Check that resources_available.eoe values are set on the vnodes.  It may take up to two minutes for the eoe values to be reported to the server.  If any vnodes do not report eoe values, restart or HUP the MOMs a second time for the vnodes missing eoe values.
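The check in A.2.1.6.3 can be automated along the following lines. This is a minimal sketch under stated assumptions: it assumes the node listing has the usual layout of `pbsnodes -av` output (vnode name at the start of a line, indented `attribute = value` lines below it), the sample text is invented for illustration, and `vnodes_missing_eoe` is a hypothetical helper name.

```python
def vnodes_missing_eoe(pbsnodes_text):
    """Return the vnode names that have no resources_available.eoe line."""
    missing = []
    current = None
    has_eoe = False
    for line in pbsnodes_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # An unindented line starts a new vnode stanza.
            if current is not None and not has_eoe:
                missing.append(current)
            current, has_eoe = line.strip(), False
        elif line.strip().startswith("resources_available.eoe ="):
            has_eoe = True
    if current is not None and not has_eoe:
        missing.append(current)
    return missing


# Illustrative listing only; real output contains many more attributes.
SAMPLE = """node1
     Mom = node1
     state = free
     resources_available.eoe = low,med,high

node2
     Mom = node2
     state = free
     resources_available.ncpus = 8
"""

print(vnodes_missing_eoe(SAMPLE))
```

Any vnode the helper reports after the two-minute window would be a candidate for a second restart or HUP of its MOM.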
                                                                                                                  80. A.2.2      An upgrade may require some additional steps.
                                                                                                                  81. A.2.2.1           If a job prologue script is defined as described in the PBS  Professional Administrator's Guide section 12.4.4, this must be converted into an execjob_prologue hook before power provisioning can be enabled.  A prologue script will no longer run after power_provisioning is enabled.
                                                                                                                  82. A.2.3      If any host is running PBS with an alternate location for the pbs.conf file, PBS_CONF_FILE must be added to the pbs_environment file on that host.  On Linux systems, the default location for the pbs.conf file is /etc/pbs.conf.  The pbs.conf file is used by each MOM to check if the server or scheduler is running on the local host.  If so, the node will not be automatically configured for power provisioning.  For example, if /var/pbs.conf is the active pbs.conf file, the following line must be added to PBS_HOME/pbs_environment:
                                                                                                                  83.                                     PBS_CONF_FILE=/var/pbs.conf
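The lookup described in A.2.3 can be illustrated with a short sketch. The source does not say how MOM decides that the server or scheduler is local; this example assumes the decision is based on the PBS_START_SERVER and PBS_START_SCHED entries in pbs.conf, and the function names are hypothetical.

```python
import os
import tempfile


def read_pbs_conf(path=None, environ=None):
    """Parse KEY=VALUE lines from pbs.conf, honoring PBS_CONF_FILE."""
    environ = os.environ if environ is None else environ
    if path is None:
        path = environ.get("PBS_CONF_FILE", "/etc/pbs.conf")
    conf = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            conf[key.strip()] = value.strip()
    return conf


def server_or_sched_local(conf):
    """Assumed test: is this host configured to start the server or scheduler?"""
    return conf.get("PBS_START_SERVER") == "1" or conf.get("PBS_START_SCHED") == "1"


# Demonstrate with a temporary file standing in for an alternate pbs.conf.
with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as tmp:
    tmp.write("PBS_SERVER=centos1\nPBS_START_SERVER=1\nPBS_START_MOM=1\n")
conf = read_pbs_conf(environ={"PBS_CONF_FILE": tmp.name})
print(server_or_sched_local(conf))
os.remove(tmp.name)
```

Without PBS_CONF_FILE in the environment, the sketch falls back to /etc/pbs.conf, which is why a host using an alternate location needs the variable set in pbs_environment.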
                                                                                                                  84. A.2.4      New behavior
85. A.2.4.1          When the power_provisioning server attribute is set to true, the PBS MOM will detect and use the PMI on the system where it is running.  The PMIs supported are:
86. A.2.4.1.1               SGI Event Driven Framework, part of SGI Management software.
87. A.2.4.1.2               Cray capmc on XC30 hardware platforms with SMW software release 7.0.UP03 and later.
                                                                                                                  88. A.2.4.2          The instructions to add or change power profile information are provided by the PMI provider if they are supported.  Here is a list of PMI vendors that support named power profiles.
                                                                                                                  89. A.2.4.2.1               SGI
90. A.2.4.3          If the PMI does not support named power profiles, resources_available.eoe should be set manually on all the nodes to give a list of power profiles.  The eoe values will be mapped to a set of options that are specific to the PMI (see A.2.4.5).  Here is a list of PMI providers that do not support named power profiles.  The options available for each provider follow its name.
91. A.2.4.3.1               Cray
                                                                                                                  92. A.2.4.3.1.1   pstate: a value for the ALPS reservation p-state.
                                                                                                                  93. A.2.4.3.1.2   pgov: a value for the ALPS reservation p-governor value.
                                                                                                                  94. A.2.4.3.1.3   pcap_node: a power cap value for each job node in watts.
                                                                                                                  95. A.2.4.3.1.4   pcap_accelerator: a power cap value for node accelerator.
96. A.2.4.4          If the PMI power profile names are obtained from one of the vendors listed in A.2.4.2, then the resources_available.eoe values will be set automatically when power_provisioning is True.
                                                                                                                  97. A.2.4.4.1               This will occur when MOM starts.
                                                                                                                  98. A.2.4.4.2               DELETED
                                                                                                                  99. A.2.4.4.3               A refresh can be forced to happen for a node by restarting or sending a HUP signal to the MOM.
100. A.2.4.5          If the PMI is from one of the vendors listed in A.2.4.3, then eoe values must be set manually on the vnodes and a submit hook needs to map the eoe values to the options listed for the PMI vendor.  The hook will set the desired job attributes for each possible eoe value.  For example:

                                                                                                                   

    # for n in node1 node2 node3; do
    >   qmgr -c "set node $n resources_available.eoe='low,med,high'"
    > done
    # cat map_eoe.py
    import pbs

    e = pbs.event()
    j = e.job

    # Look for eoe as a job-wide resource first, then in the first chunk
    # of the select statement.
    profile = j.Resource_List['eoe']
    if profile is None:
        res = j.Resource_List['select']
        if res is not None:
            for s in str(res).split('+')[0].split(':'):
                if s[:4] == 'eoe=':
                    profile = s.partition('=')[2]
                    break
    pbs.logmsg(pbs.LOG_DEBUG, "got profile '%s'" % str(profile))

    # Map the profile name to the PMI-specific job resources.
    if profile == "low":
        j.Resource_List["pstate"] = "1900000"
        j.Resource_List["pcap_node"] = 100
        pbs.logmsg(pbs.LOG_DEBUG, "set low")
    elif profile == "med":
        j.Resource_List["pstate"] = "220000"
        j.Resource_List["pcap_node"] = 200
        pbs.logmsg(pbs.LOG_DEBUG, "set med")
    elif profile == "high":
        j.Resource_List["pstate"] = "240000"
        pbs.logmsg(pbs.LOG_DEBUG, "set high")
    else:
        pbs.logmsg(pbs.LOG_DEBUG, "unhandled profile '%s'" % str(profile))
    e.accept()
    # qmgr <<EOF
    create hook power_map event=queuejob
    import hook power_map application/x-python default map_eoe.py
    set hook power_map enabled=True
    EOF

                                                                                                                   

1. A.2.4.5.1               If settings for pstate, pgov, pcap_node, or pcap_accelerator are made by the user, then the hook must be written to either overwrite or use the user values as desired.  For example, if the hook above were used as is and the user set eoe to “high” along with a value for pcap_node, then the user's pcap_node value would be in effect, which would not normally happen when eoe is set to “high”.
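The two policies mentioned in A.2.4.5.1 (overwrite the user's values, or keep them) can be made explicit in the mapping logic. The following is a minimal sketch, not part of the design: a plain dict stands in for the job's Resource_List so the logic can run outside PBS, the option values are copied from the example hook, and `PROFILE_OPTIONS`, `POWER_OPTIONS`, and `apply_profile` are hypothetical names.

```python
# Profile-to-option mapping, using values from the example hook.
PROFILE_OPTIONS = {
    "low": {"pstate": "1900000", "pcap_node": 100},
    "high": {"pstate": "240000"},
}

# The PMI-specific options a user might also set directly.
POWER_OPTIONS = ("pstate", "pgov", "pcap_node", "pcap_accelerator")


def apply_profile(resource_list, profile, keep_user_values=False):
    """Apply a profile's options to a dict standing in for Resource_List."""
    opts = PROFILE_OPTIONS.get(profile)
    if opts is None:
        return resource_list  # unhandled profile: leave the request untouched
    for name in POWER_OPTIONS:
        user_set = resource_list.get(name) is not None
        if keep_user_values and user_set:
            continue  # the user's explicit value wins
        if name in opts:
            resource_list[name] = opts[name]
        else:
            # Drop any user value the profile does not define.
            resource_list.pop(name, None)
    return resource_list


request = {"eoe": "high", "pcap_node": 150}
print(apply_profile(dict(request), "high"))
print(apply_profile(dict(request), "high", keep_user_values=True))
```

With the default policy the user's pcap_node is discarded because the “high” profile does not define one; with `keep_user_values=True` it survives alongside the profile's pstate.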
2. A.2.4.6          When a job is run without an eoe value and power_provisioning is True, no activation is done but the resources_used.energy value for the job will still be calculated.
                                                                                                                  3. A.2.4.7          If both aoe and eoe are set for a vnode, the eoe values must be the same for all the different application operating environments.
                                                                                                                  4. A.2.4.8          DELETED
                                                                                                                  5. A.2.5      No functionality was deprecated for this RFE.
                                                                                                                  6. A.3  User’s instructions
                                                                                                                  7. A.3.1      Submit a job which will request a specific power profile.
                                                                                                                  8. A.3.1.1          Use the provisioning feature and set “eoe” to a power profile name.  For example:
                                                                                                                  9.                                     qsub -leoe=low -lncpus=20 lackadaisical.sh
                                                                                                                  10.                                     qsub -lselect=4:eoe=high:ncpus=8 zoomjob
11. A.3.1.2          Submit a job without a value for “eoe”.  The behavior of the server and scheduler will not be changed for this case.  When this job runs, the power profile of the execution hosts may be changed depending on the implementation of the PMI.  For example, on a Cray (see A.2.4.3.1), a job can have job attributes (see A.1.6, A.1.7, A.1.8 and A.1.9) that affect the execution hosts.
                                                                                                                  12. A.3.2      The “resources_used.energy” will be set with a value provided by the PMI.  As with existing behavior, all values for “resources_used” will be written in the accounting log.
                                                                                                                  13. A.3.2.1          For example, energy could be included with resources_used for a job 'E' record:
                                                                                                                  14.                                     04/14/2014 04:42:03;E;1.x44-mpi.pbspro.com;user=ashisha group=altair project=_pbs_project_default jobname=STDIN queue=workq ctime=1397475718 qtime=1397475718 etime=1397475718 start=1397475718 exec_host=x44-mpi/0 exec_vnode=(x44-mpi:ncpus=1) Resource_List.ncpus=1 Resource_List.nodect=1 Resource_List.place=pack Resource_List.select=1:ncpus=1 session=4746 end=1397475723 Exit_status=255 resources_used.cpupercent=0 resources_used.cput=00:00:01 resources_used.mem=0kb resources_used.ncpus=1 resources_used.vmem=0kb resources_used.walltime=00:00:05 resources_used.energy=1.67 run_count=1
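A site that wants to report on energy use can pull the value out of the 'E' record with a few lines of Python. This is a minimal sketch: `parse_accounting_record` is a hypothetical helper, the record below is a shortened copy of the example above, and attribute values that contain spaces are not handled.

```python
def parse_accounting_record(line):
    """Split 'timestamp;type;job_id;key=value ...' into its parts."""
    timestamp, rec_type, job_id, message = line.rstrip("\n").split(";", 3)
    attrs = {}
    for token in message.split():
        # Split on the first '=' only, so values such as
        # '(x44-mpi:ncpus=1)' stay intact.
        key, sep, value = token.partition("=")
        if sep:
            attrs[key] = value
    return timestamp, rec_type, job_id, attrs


record = (
    "04/14/2014 04:42:03;E;1.x44-mpi.pbspro.com;user=ashisha group=altair "
    "queue=workq exec_vnode=(x44-mpi:ncpus=1) Resource_List.select=1:ncpus=1 "
    "resources_used.walltime=00:00:05 resources_used.energy=1.67 run_count=1"
)

timestamp, rec_type, job_id, attrs = parse_accounting_record(record)
if rec_type == "E" and "resources_used.energy" in attrs:
    print(job_id, float(attrs["resources_used.energy"]))
```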

                                                                                                                   

                                                                                                                  1. A.3.3      Monitor power usage of a job.
                                                                                                                  2. A.3.3.1          Use qstat to see the resources_used.energy value as the job runs. 
                                                                                                                  3. A.4  Integrator’s Instructions
                                                                                                                  4. A.4.1      N/A
                                                                                                                  5. A.5  Usage Notes
                                                                                                                  6. A.5.1      N/A
                                                                                                                  7. A.6  Changes to memory usage as a result of this new feature.
                                                                                                                  8. A.6.1      TBD
                                                                                                                  9. A.7  Changes to object file size as a result of this new feature.
                                                                                                                  10. A.7.1      TBD
                                                                                                                  11. A.8  Changes to the performance of any existing commands as a result of this feature.
                                                                                                                  12. A.8.1      TBD
                                                                                                                  13. A.9  Additional notes to break the administrator's instructions into one section for SGI and one for Cray.
                                                                                                                  14. A.9.1      SGI
                                                                                                                  15. A.9.1.1          Set power_provisioning on the server to true.  For example:
                                                                                                                  16.                                     # qmgr -c "set server power_provisioning = True"
                                                                                                                  17. A.9.1.2          DELETED
                                                                                                                  18. A.9.1.3          Set power_enable to true on the desired vnodes.  For example:
                                                                                                                  19.                                     # qmgr -c "set node node1 power_enable = True"
                                                                                                                  20.                                     # qmgr -c "set node node2 power_enable = True"
                                                                                                                  21.                                     # qmgr -c "set node node3 power_enable = True"
                                                                                                                  22. A.9.1.4          Restart or HUP the MOMs.
                                                                                                                  23. A.9.1.5          Check for eoe values as described in I.2.1.6.3.
                                                                                                                  24. A.9.2      Cray
                                                                                                                  25. A.9.2.1          Set power_provisioning on the server to true and power_enable to true on the desired vnodes.  For example:
                                                                                                                  26.                                     # qmgr -c "set server power_provisioning = True"
                                                                                                                  27.                                     # qmgr -c "set node node1 power_enable = True"
                                                                                                                  28.                                     # qmgr -c "set node node2 power_enable = True"
                                                                                                                  29.                                     # qmgr -c "set node node3 power_enable = True"
                                                                                                                  30. A.9.2.2          Use qmgr to set  resources_available.eoe for vnodes.
                                                                                                                  31. A.9.2.3          Import a submit hook that will map the eoe values to job resources.  See I.2.4.5.
                                                                                                                  32. A.9.2.4          Set up RUR if desired.  See I.2.1.5.
                                                                                                                      1. PMI:activate: set myself offline


                                                                                                                  1. Interface #48
                                                                                                                    1. Visibility: Public
                                                                                                                    2. Change Control: Stable
                                                                                                                    3. Synopsis: The order of a PBS hook supports the range [-1000, 2000].
                                                                                                                    4. Reference to more detail on the interface.
                                                                                                                      1. Example

                                                                                                                        # qmgr -c "set pbshook power_hook order = -1000"
                                                                                                                        # qmgr -c "set pbshook power_hook order = 2000"
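The range limit can be illustrated with a minimal sketch that builds the qmgr command only when the requested order falls inside [-1000, 2000]. The helper name `hook_order_command` and the two constants are hypothetical and are not part of PBS; the sketch only formats a string and does not run qmgr.

```python
# Hypothetical helper, not part of PBS: it only formats a qmgr command string.
MIN_HOOK_ORDER = -1000
MAX_HOOK_ORDER = 2000


def hook_order_command(hook_name, order):
    """Return the qmgr command that sets the order of a PBS hook."""
    if not MIN_HOOK_ORDER <= order <= MAX_HOOK_ORDER:
        raise ValueError(
            "hook order %d is outside [%d, %d]"
            % (order, MIN_HOOK_ORDER, MAX_HOOK_ORDER)
        )
    return 'qmgr -c "set pbshook %s order = %d"' % (hook_name, order)


print(hook_order_command("power_hook", -1000))
```

Checking the range before calling qmgr lets a site script report a clear error instead of relying on the server's rejection message.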


                                                                                                                  B. Administrator’s instructions

                                                                                                                  1. Installation does not require any new or different steps.  Once the PBS installation is complete, additional steps will be needed to enable power functionality.
                                                                                                                    1. Set power_provisioning on the server to true, and power_provisioning to true on the desired vnodes.  For example:

                                                                                                                      # qmgr -c "set server power_provisioning = True"
                                                                                                                      # qmgr -c "set node node1 power_provisioning = True"
                                                                                                                      # qmgr -c "set node node2 power_provisioning = True"
                                                                                                                      # qmgr -c "set node node3 power_provisioning = True"


                                                                                                                    2. If all vnodes will have power_provisioning set, @default can be used instead of individual vnode names.  For example:

                                                                                                                      # qmgr -c "set node @default power_provisioning = True"
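For sites that script this step, the commands can be generated rather than typed. The following is a minimal sketch, assuming the commands are later run by an administrator or passed to a shell; the function name `power_provisioning_commands` is hypothetical.

```python
# Hypothetical helper, not part of PBS: it only builds command strings.
def power_provisioning_commands(vnodes=None):
    """Return qmgr commands that enable power provisioning.

    With no vnode names, @default is used so that all vnodes are covered.
    """
    commands = ['qmgr -c "set server power_provisioning = True"']
    targets = list(vnodes) if vnodes else ["@default"]
    for name in targets:
        commands.append(
            'qmgr -c "set node %s power_provisioning = True"' % name
        )
    return commands


for command in power_provisioning_commands(["node1", "node2", "node3"]):
    print(command)
```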


                                                                                                                    3. If eoe values are not provided by the PMI, additional steps are needed.
                                                                                                                      1. Use qmgr to set  resources_available.eoe for vnodes.
                                                                                                                      2. Import a submit hook that will map the eoe values to job resources.  See B.4.e.
                                                                                                                    4. To disable power provisioning, set power_provisioning to false.
                                                                                                                      1. If power_provisioning is set to false while jobs are running, those jobs will not have their profiles deactivated when they finish, and the resources_used.energy value will not be set or updated at the end of the job. If the periodic hook event has already run at least once before power_provisioning is set to False, resources_used.energy will have a value, but it will not be accurate.
                                                                                                                    5. To disable power provisioning on selected vnodes, set power_provisioning on the vnodes to False.
                                                                                                                      1. If power_provisioning is set to false on the mother superior while jobs are running, those jobs will not have their profiles deactivated when they finish, and the resources_used.energy value will not be set or updated at the end of the job. If the periodic hook event has already run at least once before power_provisioning is set to False, resources_used.energy will have a value, but it will not be accurate.
                                                                                                                    6. On a Cray system, an optional step can be performed to use the RUR system to obtain the energy used by each aprun.  Please see the Cray document “Managing System Software for the Cray® Linux Environment” http://docs.cray.com/books/S-2393-5101/S-2393-5101.pdf.  Chapter 12 provides information on RUR.
                                                                                                                      1. The RUR config file has to be modified to use the PBS output plugin:

                                                                                                                        /opt/pbs/default/lib/cray/pbs_output.py


                                                                                                                    7. If eoe values are provided by the PMI, additional steps are needed to allow the server and MOMs to communicate the eoe values.
                                                                                                                      1. After setting power_provisioning on the desired vnodes, restart or HUP the MOMs.
                                                                                                                      2. Check that resources_available.eoe values are set on the vnodes.  It may take up to two minutes for the eoe values to be reported to the server.  If any vnodes do not report eoe values, restart or HUP the MOMs a second time for the vnodes missing eoe values.
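The check for missing eoe values can be scripted. The sketch below assumes the usual `pbsnodes -a` text layout, in which each vnode name starts in the first column and its attributes follow on indented `name = value` lines; the function name `vnodes_missing_eoe` and the sample text are hypothetical.

```python
# Hypothetical helper: parses text in the layout assumed for "pbsnodes -a".
def vnodes_missing_eoe(pbsnodes_output):
    """Return the names of vnodes that do not report resources_available.eoe."""
    missing = []
    current = None
    has_eoe = False
    for line in pbsnodes_output.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # A new vnode stanza starts; close out the previous one.
            if current is not None and not has_eoe:
                missing.append(current)
            current = line.strip()
            has_eoe = False
        elif line.strip().startswith("resources_available.eoe ="):
            has_eoe = True
    if current is not None and not has_eoe:
        missing.append(current)
    return missing


SAMPLE = """node1
     Mom = node1
     state = free
     resources_available.eoe = low,med,high

node2
     Mom = node2
     state = free
"""

print(vnodes_missing_eoe(SAMPLE))
```

The vnodes returned by such a check are the ones whose MOMs would be restarted or sent a HUP a second time.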
                                                                                                                  2. An upgrade may require some additional steps.
                                                                                                                    1. If a job prologue script is defined as described in the PBS Professional Administrator's Guide section 12.4.4, it must be converted into an execjob_prologue hook before power provisioning can be enabled.  A prologue script will no longer run after power_provisioning is enabled.
                                                                                                                  3. If any host is running PBS with an alternate location for the pbs.conf file, PBS_CONF_FILE must be added to the pbs_environment file on that host.  On Linux systems, the default location for the pbs.conf file is /etc/pbs.conf.  The pbs.conf file is used by each MOM to check if the server or scheduler is running on the local host.  If so, the node will not be automatically configured for power provisioning.  For example, if /var/pbs.conf is the active pbs.conf file, the following line must be added to PBS_HOME/pbs_environment:

                                                                                                                    PBS_CONF_FILE=/var/pbs.conf
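A minimal sketch of how the pbs_environment contents could be updated follows. It works on the file text only, so reading and writing PBS_HOME/pbs_environment is left to the caller; the function name `ensure_pbs_conf_file` is hypothetical.

```python
# Hypothetical helper: edits the text of a pbs_environment file in memory.
def ensure_pbs_conf_file(env_text, conf_path):
    """Return env_text with exactly one PBS_CONF_FILE line for conf_path."""
    wanted = "PBS_CONF_FILE=%s" % conf_path
    # Drop any existing PBS_CONF_FILE entry so the new value is the only one.
    kept = [
        line for line in env_text.splitlines()
        if not line.startswith("PBS_CONF_FILE=")
    ]
    kept.append(wanted)
    return "\n".join(kept) + "\n"


print(ensure_pbs_conf_file("TZ=UTC\n", "/var/pbs.conf"), end="")
```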


                                                                                                                  4. New behavior
                                                                                                                    1. When the power_provisioning server attribute is set to True, the PBS MOM will detect and use the PMI on the system where it is running.  The PMIs supported are:
                                                                                                                      1. SGI HPE Event Driven Framework, part of HPE Management software.
                                                                                                                      2. CRAY capmc on XC30 hardware platforms with SMW software release 7.0.UP03 and later.
                                                                                                                    2. The instructions to add or change power profile information are provided by the PMI provider if they are supported.  Here is a list of PMI vendors that support named power profiles.
                                                                                                                      1. SGI HPE
                                                                                                                    3. If the PMI does not support named power profiles, resources_available.eoe should be set manually on all the nodes to give a list of power profiles.  The eoe values will be mapped to a set of options that are specific to the PMI (see B.4.e).  Here is a list of PMI providers that do not support named power profiles.  The options available for each provider follow their name.
                                                                                                                      1. CRAY
                                                                                                                        1. pstate: a value for the ALPS reservation p-state.
                                                                                                                        2. pgov: a value for the ALPS reservation p-governor value.
                                                                                                                        3. pcap_node: a power cap value for each job node in watts.
                                                                                                                        4. pcap_accelerator: a power cap value for node accelerator.
                                                                                                                    4. If the PMI power profile names are obtained from one of the vendors listed in B.4.b, then the resources_available.eoe values will be set automatically when power_provisioning is True.
                                                                                                                      1. This will occur when MOM starts.
                                                                                                                      2. A refresh can be forced to happen for a node by restarting or sending a HUP signal to the MOM.
                                                                                                                    5. If the PMI power profile names are obtained from one of the vendors listed in B.4.c, then eoe values must be set manually on the vnodes and a submit hook needs to map the eoe values to the options listed for the PMI vendor.  The hook will set the desired job attributes for each possible eoe value.  For example:

                                                                                                                      # for n in node1 node2 node3 ; do
                                                                                                                      >   qmgr -c "set node $n resources_available.eoe='low,med,high'"
                                                                                                                      > done

                                                                                                                      # cat map_eoe.py
                                                                                                                      import pbs

                                                                                                                      e = pbs.event()
                                                                                                                      j = e.job

                                                                                                                      # Prefer a job-wide eoe request; otherwise look in the first select chunk.
                                                                                                                      profile = j.Resource_List['eoe']
                                                                                                                      if profile is None:
                                                                                                                          res = j.Resource_List['select']
                                                                                                                          if res is not None:
                                                                                                                              for s in str(res).split('+')[0].split(':'):
                                                                                                                                  if s[:4] == 'eoe=':
                                                                                                                                      profile = s.partition('=')[2]
                                                                                                                                      break

                                                                                                                      pbs.logmsg(pbs.LOG_DEBUG, "got profile '%s'" % str(profile))

                                                                                                                      # Map each eoe value to the PMI-specific job resources.
                                                                                                                      if profile == "low":
                                                                                                                          j.Resource_List["pstate"] = "1900000"
                                                                                                                          j.Resource_List["pcap_node"] = 100
                                                                                                                          pbs.logmsg(pbs.LOG_DEBUG, "set low")
                                                                                                                      elif profile == "med":
                                                                                                                          j.Resource_List["pstate"] = "220000"
                                                                                                                          j.Resource_List["pcap_node"] = 200
                                                                                                                          pbs.logmsg(pbs.LOG_DEBUG, "set med")
                                                                                                                      elif profile == "high":
                                                                                                                          j.Resource_List["pstate"] = "240000"
                                                                                                                          pbs.logmsg(pbs.LOG_DEBUG, "set high")
                                                                                                                      else:
                                                                                                                          pbs.logmsg(pbs.LOG_DEBUG, "unhandled profile '%s'" % str(profile))

                                                                                                                      e.accept()

                                                                                                                      # qmgr <<EOF
                                                                                                                      create hook power_map event=queuejob
                                                                                                                      import hook power_map application/x-python default map_eoe.py
                                                                                                                      set hook power_map enabled=True
                                                                                                                      EOF


                                                                                                                      1. If the user sets pstate, pgov, pcap_node, or pcap_accelerator, the hook must be written to either overwrite or use the user's values as desired.  For example, if the hook above were used as is and the user set eoe to “high” along with a value for pcap_node, the pcap_node value would remain in effect, which would not normally happen when eoe is set to “high”.
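The two policies (overwrite the user's values, or keep them) can be sketched outside the hook with a plain dictionary standing in for the job's Resource_List. This is only an illustration under that assumption: `PROFILE_SETTINGS`, `POWER_OPTIONS`, and `apply_profile` are hypothetical names, and the profile values are copied from the example hook.

```python
# Hypothetical names; a dict stands in for the job's Resource_List.
POWER_OPTIONS = ("pstate", "pgov", "pcap_node", "pcap_accelerator")

# Values copied from the example hook's low/med/high branches.
PROFILE_SETTINGS = {
    "low": {"pstate": "1900000", "pcap_node": 100},
    "med": {"pstate": "220000", "pcap_node": 200},
    "high": {"pstate": "240000"},
}


def apply_profile(resources, profile, overwrite_user=True):
    """Return a copy of resources with the profile's power options applied."""
    result = dict(resources)
    if profile not in PROFILE_SETTINGS:
        return result  # unhandled profile: leave the request untouched
    settings = PROFILE_SETTINGS[profile]
    if overwrite_user:
        # Remove every user-supplied power option so only the profile applies.
        for name in POWER_OPTIONS:
            result.pop(name, None)
        result.update(settings)
    else:
        # Keep user values and fill in only what the user did not set.
        for name, value in settings.items():
            result.setdefault(name, value)
    return result


request = {"eoe": "high", "pcap_node": 150}
print(apply_profile(request, "high"))
print(apply_profile(request, "high", overwrite_user=False))
```

With `overwrite_user=True` the user's pcap_node is dropped for the “high” profile; with `overwrite_user=False` it stays in effect, which matches the behavior described for the unmodified hook.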
                                                                                                                    6. When a job is run without an eoe value and power_provisioning is True, no profile activation is done, but the resources_used.energy value for the job will still be calculated.
                                                                                                                    7. If both aoe and eoe are set for a vnode, the eoe values must be the same for all the different application operating environments.
                                                                                                                  5. No functionality was deprecated for this RFE.
                                                                                                                  6. Additional notes to break the administrator's instructions into one section for SGI HPE and one for Cray.
                                                                                                                    1. SGI HPE
                                                                                                                      1. Set power_provisioning on the server to true.  For example:

                                                                                                                        # qmgr -c "set server power_provisioning = True"


                                                                                                                      2. Set power_provisioning to true on the desired vnodes.  For example:

                                                                                                                        # qmgr -c "set node node1 power_provisioning = True"
                                                                                                                        # qmgr -c "set node node2 power_provisioning = True"
                                                                                                                        # qmgr -c "set node node3 power_provisioning = True"


                                                                                                                      3. Restart or HUP the MOMs.
                                                                                                                      4. Check for eoe values as described in B.1.g.ii.
                                                                                                                    2. Cray
                                                                                                                      1. Set power_provisioning on the server to true and power_provisioning to true on the desired vnodes.  For example:

                                                                                                                        # qmgr -c "set server power_provisioning = True"
                                                                                                                        # qmgr -c "set node node1 power_provisioning = True"
                                                                                                                        # qmgr -c "set node node2 power_provisioning = True"
                                                                                                                        # qmgr -c "set node node3 power_provisioning = True"


                                                                                                                      2. Use qmgr to set resources_available.eoe for vnodes.
                                                                                                                      3. Import a submit hook that will map the eoe values to job resources.  See B.4.e.
                                                                                                                      4. Set up RUR if desired.  See B.1.f.


                                                                                                                  C. User’s instructions

                                                                                                                  1. Submit a job which will request a specific power profile.
                                                                                                                    1. Use the provisioning feature and set “eoe” to a power profile name.  For example:

                                                                                                                      qsub -leoe=low -lncpus=20 lackadaisical.sh
                                                                                                                      qsub -lselect=4:eoe=high:ncpus=8 zoomjob
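For submissions driven from a script, the same two request styles can be built as argument lists. This is a minimal sketch; the function name `qsub_args` and its parameters are hypothetical, and the list is meant to be handed to something like `subprocess.run` on a host where qsub is available.

```python
# Hypothetical helper: builds a qsub argument list, it does not submit a job.
def qsub_args(script, profile, ncpus, chunks=None):
    """Return qsub arguments requesting a power profile through eoe.

    Without chunks, eoe and ncpus are requested job-wide; with chunks,
    they are placed in a select statement.
    """
    if chunks is None:
        return ["qsub", "-l", "eoe=%s" % profile, "-l", "ncpus=%d" % ncpus, script]
    select = "select=%d:eoe=%s:ncpus=%d" % (chunks, profile, ncpus)
    return ["qsub", "-l", select, script]


print(" ".join(qsub_args("lackadaisical.sh", "low", 20)))
print(" ".join(qsub_args("zoomjob", "high", 8, chunks=4)))
```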


                                                                                                                    2. Submit a job without a value for “eoe”.  The behavior of the server and scheduler will not be changed for this case.  When this job runs, the power profile of the execution hosts may be changed depending on the implementation of the PMI.  For example, on a Cray (see B.4.c.1), a job can have job attributes (see A.6, A.7, A.8 and A.9) that affect the execution hosts.
                                                                                                                  2. The “resources_used.energy” will be set with a value provided by the PMI.  As with existing behavior, all values for “resources_used” will be written in the accounting log.
                                                                                                                    1. For example, energy could be included with resources_used for a job 'E' record:

                                                                                                                      04/14/2014 04:42:03;E;1.x44-mpi.pbspro.com;user=ashisha group=altair project=_pbs_project_default jobname=STDIN queue=workq ctime=1397475718 qtime=1397475718 etime=1397475718 start=1397475718 exec_host=x44-mpi/0 exec_vnode=(x44-mpi:ncpus=1) Resource_List.ncpus=1 Resource_List.nodect=1 Resource_List.place=pack Resource_List.select=1:ncpus=1 session=4746 end=1397475723 Exit_status=255 resources_used.cpupercent=0 resources_used.cput=00:00:01 resources_used.mem=0kb resources_used.ncpus=1 resources_used.vmem=0kb resources_used.walltime=00:00:05 resources_used.energy=1.67 run_count=1
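The energy value can be pulled out of such a record with a few lines of parsing. The sketch below assumes the layout shown in the example, with four semicolon-separated fields and space-separated `key=value` pairs in the last one; values that contain spaces would need more careful handling. The function name `parse_accounting_record` is hypothetical, and the sample is an abridged copy of the record above.

```python
# Hypothetical helper: parses one accounting log line in the layout shown.
def parse_accounting_record(line):
    """Split an accounting record into its header fields and attributes."""
    timestamp, record_type, job_id, message = line.strip().split(";", 3)
    attributes = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep:
            attributes[key] = value
    return {
        "timestamp": timestamp,
        "type": record_type,
        "id": job_id,
        "attributes": attributes,
    }


# Abridged copy of the example 'E' record.
SAMPLE_RECORD = (
    "04/14/2014 04:42:03;E;1.x44-mpi.pbspro.com;user=ashisha group=altair "
    "jobname=STDIN queue=workq Resource_List.select=1:ncpus=1 "
    "Exit_status=255 resources_used.walltime=00:00:05 "
    "resources_used.energy=1.67 run_count=1"
)

record = parse_accounting_record(SAMPLE_RECORD)
print(record["id"], float(record["attributes"]["resources_used.energy"]))
```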

                                                                                                                                     

                                                                                                                  3. Monitor power usage of a job.
                                                                                                                    1. Use qstat to see the resources_used.energy value as the job runs. 
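The same check can be scripted. The sketch below assumes that "qstat -f" prints job attributes one per line in "name = value" form; the helper names are hypothetical, and the sample text is hand-written for illustration rather than captured from a real system.

```python
import re
import subprocess

# Matches a line such as "    resources_used.energy = 1.67".
_ENERGY_RE = re.compile(
    r"^\s*resources_used\.energy\s*=\s*(\S+)\s*$", re.MULTILINE
)


def energy_from_qstat_output(text):
    """Return the energy value found in qstat -f style text, or None."""
    match = _ENERGY_RE.search(text)
    return float(match.group(1)) if match else None


def poll_job_energy(job_id):
    """Run qstat -f for one job and return its energy value, or None."""
    result = subprocess.run(
        ["qstat", "-f", job_id], capture_output=True, text=True, check=True
    )
    return energy_from_qstat_output(result.stdout)


# Hand-written sample in the assumed output format.
sample = """Job Id: 1.x44-mpi.pbspro.com
    Job_Name = STDIN
    resources_used.energy = 1.67
    resources_used.walltime = 00:00:05
"""
print(energy_from_qstat_output(sample))  # 1.67
```

Calling poll_job_energy repeatedly while the job runs would show the value as it is updated; the attribute is absent (None here) until the PMI has reported usage.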


                                                                                                                  D. Internal Design Interfaces

                                                                                                                  1. Interface #1
                                                                                                                    1. Visibility: Public

                                                                                                                    2. Change Control: Stable

                                                                                                                    3. Synopsis: PBS hook power control module

                                                                                                                      1. A new class “pbs.Power” will be made available to provide power management functionality.  A hook will be able to access it via a Python import.

                                                                                                                    4. Reference to more detail on the interface. The following define the PMI operations available:

                                                                                                                      1. activate_profile(self, profile_name, job)

                                                                                                                        1. Activate a given power profile on a set of hosts on behalf of a given job.  The parameter “profile_name” is a string containing the name of a profile.  The parameter “job” is a PBS job object.  The hosts will be calculated from the job object.  If the job parameter is not specified, the pbs.event().job object will be used.

                                                                                                                        2. The return type is bool.  True indicates success; False indicates that the request was made but the PMI gave no indication of whether it succeeded.

                                                                                                                        3. If an error occurs where it is appropriate for some or all of the job vnodes to be marked offline, this may be done before an exception is raised.

                                                                                                                        4. If an error occurs where it is appropriate for the supported profile names for some or all of the job vnodes to be refreshed, this may be done before an exception is raised.

                                                                                                                      2. get_usage(self, job)

                                                                                                                        1. Retrieve power usage for a job.  The parameter “job” is a PBS job object.

                                                                                                                        2. The return will be a float which gives the cumulative energy usage for the job at the time of the call in kilowatt-hours (kWh).  If no power usage information is available, None is returned.

                                                                                                                      3. deactivate_profile(self, job)

                                                                                                                        1. Inform the PMI that a job is no longer active.  This would be used when a job is suspended or terminated.  The parameter “job” is a PBS job object.  If it is not specified, the pbs.event().job object will be used.

                                                                                                                        2. The return type is bool.  True indicates success; False indicates that the request was made but the PMI gave no indication of whether it succeeded.

                                                                                                                      4. query(self, query_type)

                                                                                                                        1. Return information that matches a request type.  The parameter “query_type” specifies what should be returned.  The only supported value for query_type is QUERY_PROFILE, for which the return is a list of strings giving the profile names supported by the PMI.

                                                                                                                      5. connect(self, endpoint, port)

                                                                                                                        1. Connect to the PMI.  The parameter “endpoint” defaults to None and is a string which will be meaningful to the PMI.  The parameter “port” defaults to None and is an integer.  A typical usage would be “endpoint” specifying a hostname and “port” giving a network port for a network service connection.

                                                                                                                        2. Currently, connection and disconnection will be done per hook rather than by creating a long-lived session.

                                                                                                                        3. Nothing is returned; the connection information is maintained in the instance of the Power class.

                                                                                                                        4. If the endpoint or port parameters are not specified, the underlying code specific to the PMI will determine the connection details.

                                                                                                                      6. disconnect(self)

                                                                                                                        1. Disconnect from the PMI.  No parameters are needed, since each instance of the Power class is associated with a backend power management interface.

                                                                                                                      7. Exceptions

                                                                                                                        1. InternalError - raised in cases where the underlying cause of a failure cannot be determined.

                                                                                                                        2. BackendError - the backend PMI call was unsuccessful.

                                                                                                                      8. Power module initialization

                                                                                                                        1. A string can optionally be passed to specify the name of the PMI to be used (see D.2).  By default, the type of PMI to be used will be determined automatically based on the type of hardware used.

                                                                                                                    5. Examples

                                                                                                                      1. Activate a profile from a job specific event.


                                                                                                                        import pbs

                                                                                                                        p = pbs.Power()

                                                                                                                        p.connect("power_master")

                                                                                                                        p.activate_profile("LOW")

                                                                                                                        p.disconnect()


                                                                                                                      2. Get profile name list.


                                                                                                                        import pbs

                                                                                                                        p = pbs.Power()

                                                                                                                        p.connect(port=3564)

                                                                                                                        pnames = p.query(p.QUERY_PROFILE)

                                                                                                                        p.disconnect()


                                                                                                                      3. Deactivate profile on a specific job.


                                                                                                                        import pbs

                                                                                                                        p = pbs.Power()

                                                                                                                        badjob = pbs.server().job("10")

                                                                                                                        p.connect()

                                                                                                                        p.deactivate_profile(job=badjob)

                                                                                                                        p.disconnect()
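The examples above omit error handling. The sketch below shows one way a hook might make sure disconnect() is always called and the two exceptions described for this interface are caught. Because the pbs module can only be imported inside a hook, the sketch uses FakePower, a hypothetical in-memory stand-in written for this illustration; the exception classes, profile names, and the apply_profile helper are likewise assumptions, not part of PBS.

```python
class BackendError(Exception):
    """Stand-in for the BackendError described above (hypothetical)."""


class InternalError(Exception):
    """Stand-in for the InternalError described above (hypothetical)."""


class FakePower:
    """Hypothetical stand-in that mimics the pbs.Power methods listed above."""

    QUERY_PROFILE = 0  # placeholder value

    def __init__(self, profiles=("LOW", "MED", "HIGH")):
        self._profiles = list(profiles)
        self.connected = False
        self.active = {}  # job -> active profile name

    def connect(self, endpoint=None, port=None):
        self.connected = True

    def disconnect(self):
        self.connected = False

    def query(self, query_type):
        if query_type != self.QUERY_PROFILE:
            raise InternalError("unknown query type")
        return list(self._profiles)

    def activate_profile(self, profile_name, job=None):
        if not self.connected:
            raise InternalError("not connected")
        if profile_name not in self._profiles:
            raise BackendError(f"unsupported profile: {profile_name}")
        self.active[job] = profile_name
        return True


def apply_profile(power, profile_name, job=None):
    """Connect, activate a profile, and always disconnect.

    Returns the bool from activate_profile, or None if an exception was raised.
    """
    power.connect()
    try:
        return power.activate_profile(profile_name, job)
    except (BackendError, InternalError) as err:
        print(f"could not activate {profile_name}: {err}")
        return None
    finally:
        power.disconnect()


power = FakePower()
print(apply_profile(power, "LOW", job="10"))    # True
print(apply_profile(power, "TURBO", job="10"))  # prints the error, then None
```

In a real hook the same try/finally shape would wrap a pbs.Power() instance, with the exception classes taken from wherever the PBS installation exposes them.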


                                                                                                                  2. Interface #2
                                                                                                                    1. Visibility: Public
                                                                                                                    2. Change Control: Unstable
                                                                                                                    3. Synopsis: Expose the hook PMI structure to allow additions to the supported PMI list.
                                                                                                                    4. Reference to more detail on the interface.
                                                                                                                      1. The PBS “power” hook can be modified to specify a PMI name in the pbs.Power() instantiation in the init_power function.  For example, the code below would cause the new file described in D.2.d.ii to be used by the hook:


                                                                                                                        power = pbs.Power("ipmitool")


                                                                                                                      2. Python code patterned after the file PBS_EXEC/lib/python/altair/pbs/v1/_pmi_none.py must be placed in a file whose name replaces “none” with the name of the PMI being implemented.  For example:


                                                                                                                        # cd $PBS_EXEC/lib/python/altair/pbs/v1

                                                                                                                        # cp _pmi_none.py _pmi_ipmitool.py

                                                                                                                        # vi _pmi_ipmitool.py


                                                                                                                      3. The defined functions must all be present: __init__, _connect, _disconnect, _get_usage, _query, _activate_profile, _deactivate_profile.  These all have the same arguments as those in I.1.1, except that each function name has an initial underscore ('_').
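As an illustration of the list above, a new PMI file might start from a skeleton like the following. This is a sketch only: the class name Pmi, the constructor taking no arguments, the default values, and the placeholder bodies are all assumptions, and the real template is whatever _pmi_none.py contains in the installation. The missing_functions helper is hypothetical and simply checks that every required name is defined.

```python
class Pmi:
    """Hypothetical skeleton for a file such as _pmi_ipmitool.py."""

    def __init__(self):
        self._connected = False
        self._active = {}  # job -> profile name

    def _connect(self, endpoint=None, port=None):
        # A real backend would open its session here.
        self._endpoint = endpoint
        self._port = port
        self._connected = True

    def _disconnect(self):
        self._connected = False

    def _get_usage(self, job):
        # None means no power usage information is available.
        return None

    def _query(self, query_type):
        # A real backend would return its supported profile names.
        return []

    def _activate_profile(self, profile_name, job):
        self._active[job] = profile_name
        # False: request made, no confirmation from the backend.
        return False

    def _deactivate_profile(self, job):
        self._active.pop(job, None)
        return False


REQUIRED = (
    "__init__", "_connect", "_disconnect", "_get_usage",
    "_query", "_activate_profile", "_deactivate_profile",
)


def missing_functions(cls):
    """Return the required names that cls does not define itself."""
    return [name for name in REQUIRED if name not in vars(cls)]


print(missing_functions(Pmi))  # []
```

Checking vars(cls) rather than using getattr avoids counting the __init__ that every class inherits from object.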