Interface 1: New option to write a job's output (stdout/stderr) files directly to their final destination, instead of staging them, if the final destination is known to be writable from the job's execution node.

  • Visibility: Public
  • Change Control: Stable
  • Details: A user can choose to have their job's output (.o and .e) files written directly to the final destination instead of being staged, provided the file system is available from Mother Superior.
  • The "d" modifier can be used with the existing "qsub -k" option (e.g. qsub -k oed).
  • The phrase "known to be writable" means "the file's ultimate destination host:path is mapped from the primary execution node via the existing $usecp directive in the mom config".
  • The job's Output_path and Error_path are settable with the -o and -e options, and will be honored if the "d" modifier is used for the corresponding file.
  • The admin can make this behavior the default by setting "default_qsub_arguments = -koed".
  • If the d modifier for -k is used but the specified file's final destination path(s) are NOT usecp-able, the mom should log a warning and continue running the job with normal spooling and staging to the final destination.
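
The decision described above can be sketched roughly as follows. This is only an illustrative model, not the mom's actual implementation: the function names (parse_keep, is_usecp_able, plan_output) are hypothetical, and the $usecp rules are assumed to be reducible to (host pattern, path prefix) pairs with shell-style wildcards in the host part.

```python
from fnmatch import fnmatch


def parse_keep(k_arg):
    """Map a -k argument such as "oed" to a per-stream mode.

    "direct": write straight to the final destination ("d" modifier present)
    "keep":   existing -k behavior (file kept on the execution host)
    "stage":  existing default (spool, then stage to the final destination)
    """
    direct = "d" in k_arg
    modes = {}
    for stream in ("o", "e"):
        if stream not in k_arg:
            modes[stream] = "stage"
        else:
            modes[stream] = "direct" if direct else "keep"
    return modes


def is_usecp_able(dest, usecp_rules):
    """Return True if dest ("host:path") matches an assumed (host_pattern, prefix) rule."""
    host, _, path = dest.partition(":")
    for host_pattern, prefix in usecp_rules:
        under_prefix = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if fnmatch(host, host_pattern) and under_prefix:
            return True
    return False


def plan_output(k_arg, dests, usecp_rules):
    """Decide per stream whether to write directly or fall back to staging."""
    plan, warnings = {}, []
    for stream, mode in parse_keep(k_arg).items():
        if mode == "direct" and not is_usecp_able(dests[stream], usecp_rules):
            # Fallback described in the spec: warn, then spool and stage as usual.
            warnings.append(f"{stream}: {dests[stream]} is not usecp-able; spooling and staging")
            mode = "stage"
        plan[stream] = mode
    return plan, warnings
```

For instance, with a single assumed rule ("*.example.com", "/home"), an output path under /home on a matching host is planned as "direct", while an error path on an unmapped host falls back to "stage" and produces one warning.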

Interface 2: A user shall be able to provide an option at job submission time to have PBS remove the output files (.o and .e) for that job, if it completes successfully.

  • Visibility: Public
  • Change Control: Stable
  • Details: Introduce a new "d" option for qsub which means "remove upon job completion".
  • "Job completion" here means the job terminated with no errors.
  • qsub -d oe job.sh
  • The admin can make this behavior the default by setting "default_qsub_arguments = -doe".
  • The user can choose which files are deleted: .o, .e, or both.
  • Currently, when sandbox is used with the -k option, the .o and .e files are deleted from the hosts. Going forward, deletion will happen only with the "-d" option and not with "-k", since "-k" stands for "Keep_Files" and is not expected to delete anything.
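
The removal rule can be illustrated with a small sketch. The helper name files_to_remove is hypothetical, and treating an exit status of 0 as "terminated with no errors" is an assumption made only for this example; the spec does not define how success is detected.

```python
def files_to_remove(d_arg, exit_status):
    """Return the set of streams ("o", "e") to remove once the job has ended.

    d_arg is the argument given to -d, e.g. "oe", "o" or "e".
    """
    # Assumption: exit status 0 is the only "no errors" outcome.
    if exit_status != 0:
        return set()
    return {stream for stream in d_arg if stream in ("o", "e")}
```

Under these assumptions, "-d oe" on a successful job yields both files for removal, "-d e" yields only the error file, and any failed job yields nothing, so its files are left in place or staged as usual.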


Examples:

qsub -koed 
Means direct write both the job's output and error files to the Output_path and Error_path if host:path is usecp-able from the primary exec host.  If they are not, issue a warning in mom log and do normal spooling and staging.

qsub -kod 
Means direct write the job's output file to the Output_path if host:path is usecp-able from the primary exec host.  If it is not, issue a warning in mom log and do normal spooling and staging of the output file.  The job's error file will be spooled in $PBS_HOME/spool and staged to Error_path per existing behavior since nothing concerning it was specified.

qsub -ked 
Means direct write the job's error file to the Error_path if host:path is usecp-able from the primary exec host.  If it is not, issue a warning in mom log and do normal spooling and staging of the error file.  The job's output file will be spooled and staged per existing behavior since nothing concerning it was specified.

qsub -doe -koe
Means write both files directly to the user's local home directory (whether it is usecp-able does not matter in this case; this is existing -koe functionality), then remove both files upon successful job completion.

qsub -koed -doe
Means direct write both the job's output and error files to the Output_path and Error_path if host:path is usecp-able from the primary exec host.  If they are not, issue a warning in mom log and do normal spooling. When the job completes successfully, remove the output and error files either from their directly written location or from $PBS_HOME/spool.  If the job is unsuccessful, leave the files in place or stage them to Output_path and Error_path if they were spooled.

qsub -keo -de
Means write both files to the user's local home directory (whether it is usecp-able does not matter in this case; this is existing -koe functionality), then remove only the error file upon successful job completion.

qsub -koed -de

Means both output and error files are directly written to the Output_path and Error_path if host:path is usecp-able from the primary exec host.  The error file will be removed upon successful job completion; the output file is retained.

qsub -Wsandbox=PRIVATE -koed

Means the output files will be written to the sandbox and then staged into the submission directory. The staged files will not be deleted. Direct output is not used when sandbox is used (similar to existing -k behavior with sandbox).

qsub -Wsandbox=PRIVATE -koed -doe

Means the output files will be written to the sandbox and will be deleted.

Community discussion
