Comment: hook is enabled on Cray X* series

...

Overview:
PBS and ALPS can sometimes get out of sync. The synchronization hook checks whether the information that PBS has
matches what ALPS is reporting. When the hook detects that PBS and ALPS are out of sync, it sends a HUP to the Mom.
The hook only does its work on Cray X-series Moms.
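
The check described above can be sketched as a set comparison between the two node lists. This is a minimal illustration, not the hook's actual code: the function names, the `mom_pid` parameter, and the comma-separated node list are assumptions, and the message text only approximates the log entries listed under Interfaces 10 and 11.

```python
import os
import signal


def find_mismatches(alps_nodes, pbs_nodes):
    """Return (only_in_alps, only_in_pbs) as sorted lists of node names."""
    alps, pbs = set(alps_nodes), set(pbs_nodes)
    return sorted(alps - pbs), sorted(pbs - alps)


def build_log_messages(only_in_alps, only_in_pbs):
    """Build one message per direction of mismatch (wording is approximate)."""
    prefix = "ALPS Inventory Check: Compute node(s) defined in "
    messages = []
    if only_in_alps:
        messages.append(prefix + "ALPS, but not in PBS: " + ",".join(only_in_alps))
    if only_in_pbs:
        messages.append(prefix + "PBS, but not in ALPS: " + ",".join(only_in_pbs))
    return messages


def check_inventory(alps_nodes, pbs_nodes, mom_pid=None):
    """Compare inventories; send SIGHUP to the Mom only when they differ.

    mom_pid is a hypothetical parameter: pass None to skip signalling.
    """
    only_in_alps, only_in_pbs = find_mismatches(alps_nodes, pbs_nodes)
    messages = build_log_messages(only_in_alps, only_in_pbs)
    if messages and mom_pid is not None:
        os.kill(mom_pid, signal.SIGHUP)  # ask the Mom to re-read its inventory
    return messages
```

Calling `check_inventory(["n1", "n2"], ["n2", "n3"])` without a `mom_pid` returns the two mismatch messages and signals nothing, which makes the comparison easy to exercise outside a running PBS installation.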


Interface 1: PBS hook PBS_alps_inventory_check

  • Visibility: Public
  • Change Control: Experimental
  • Details: 
    • This is a periodic hook that runs on the execution host.
    • The hook is enabled by default when run on a Cray X* series machine.
      • The hook is disabled by default on all other platforms.
    • The hook runs as the Administrator and executes every 300 seconds.
    • The timeout for the Hook is 90 seconds.
        

...

  • Visibility: PBS Private
  • Change Control: Experimental
  • Details: 
    • The mom installed on a login node reports inventory; additional moms, if any, do not.
    • The first instance of 'name' is the hostname of the login node responsible for performing the inventory query. The second 
      instance of 'name' is the hostname of the current/local mom.
    • Log level: PBSEVENT_ADMIN.

...

Interface 10: Mom log entry: ALPS Inventory Check: Compute node(s) defined in ALPS, but not in PBS: <list of nodes>

...

Interface 11: Mom log entry: ALPS Inventory Check: Compute node(s) defined in PBS, but not in ALPS: <list of nodes>

...

  • Visibility: PBS Private
  • Change Control: Experimental
  • Details: 
    • Recorded when the hook is unable to HUP the Mom and successfully refresh nodes.
    • Log level: PBSEVENT_ADMIN.