Enhance pbs_snapshot to capture remote host data as "sub" snapshots

Motivation:

This is the next step towards making pbs_snapshot friendly to non-root users. Currently, data from remote hosts can only be captured if pbs_snapshot is run as root, because it copies files over directly, and privileged files (like mom_priv/) need root privilege to be copied. Copying remote data directly with sudo is tricky and error-prone. It is therefore better to run another pbs_snapshot command on each remote host, capture sub-snapshots, and copy those tar files over instead.

Forum Discussion: http://community.pbspro.org/t/capture-sub-snapshots-for-pbs-snapshot-additional-hosts/1192

Interface changes:

  • Snapshot directory structure change:
    • Earlier, data for remote hosts was intermixed with data for the server node. For example, mom_priv from a remote host was captured as mom_priv<hostname>, mom logs from the same host were captured in a separate directory called mom_logs<hostname>, etc. Now, all of the data from a remote host will be captured as a single tar file.
    • The sub-snapshot tar files will be named after their respective hosts, i.e., <hostname>_snapshot.tgz
  • The primary host captured will now be the local host by default:
    • Earlier, pbs_snapshot would parse the pbs.conf on the local host, find the pbspro server host, and capture that. Now, pbs_snapshot will capture the local host by default. If pbs_snapshot is invoked from a client host, the -H option should be used to point to the remote PBS server. This change is needed to prevent the child pbs_snapshot invocations from capturing data from the main PBS server host.

Algorithm changes:

  • When --additional-hosts is provided, PBSSnapUtils will launch multiple pbs_snapshot commands, one for each additional host.
  • pbs_snapshot now tries to determine which daemon information is available on each host and captures information accordingly. For example, for a host running both PBS server and mom, it will capture both server and mom info; for a host running only mom, it will only try to capture mom info, etc.
  • As mentioned in the interface changes, pbs_snapshot no longer reads pbs.conf to determine the PBS_SERVER host. It relies on the user to provide the correct primary host to capture via the -H option.
  • When pbs_snapshot's --with-sudo option is specified, the child pbs_snapshot commands for remote hosts will be invoked with sudo.
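
The per-host fan-out described above can be sketched roughly as follows. This is only an illustration, not the actual PBSSnapUtils code: the function names are hypothetical, and it assumes that --additional-hosts is a comma-separated list, that child commands are run over ssh, and that each child writes to a remote output directory passed with -o. Only the <hostname>_snapshot.tgz naming and the use of sudo for child commands come from this document.

```python
def sub_snapshot_name(hostname):
    """Name of the tar file holding one remote host's sub-snapshot."""
    return f"{hostname}_snapshot.tgz"


def build_child_command(hostname, remote_outdir, with_sudo=False):
    """Build the argv for one child pbs_snapshot run (hypothetical helper).

    Assumptions: ssh is the remote transport and -o selects the remote
    output directory. No -H is passed, so the child captures its own
    local host by default.
    """
    cmd = ["pbs_snapshot", "-o", remote_outdir]
    if with_sudo:
        # --with-sudo on the parent means the child is invoked with sudo
        cmd = ["sudo"] + cmd
    return ["ssh", hostname] + cmd


def plan_sub_snapshots(additional_hosts, remote_outdir, with_sudo=False):
    """Map each additional host to (child command, sub-snapshot file name).

    additional_hosts is assumed to be a comma-separated string.
    """
    hosts = [h.strip() for h in additional_hosts.split(",") if h.strip()]
    return {
        h: (build_child_command(h, remote_outdir, with_sudo),
            sub_snapshot_name(h))
        for h in hosts
    }


if __name__ == "__main__":
    # Print the plan only; nothing is executed on any host.
    plan = plan_sub_snapshots("momA,momB", "/tmp/snap", with_sudo=True)
    for host, (cmd, tarball) in plan.items():
        print(host, " ".join(cmd), "->", tarball)
```

The sketch only plans the commands; running them and copying the resulting tar files back into the parent snapshot is left out.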







