PBS Pro / PP-479

As an admin, I would like running subjobs to be able to survive a pbs_server restart, so that the work up to that point is not lost


    • Type: User Story
    • Status: Resolved
    • Priority: Low
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 18.1.1
    • Component/s: None
    • Labels:


      Currently, on server restart, job arrays that have running subjobs are terminated because the subjobs are stored only in memory. I would like this behavior to be changed so that running subjobs continue to run after a server restart. It would also be great if we could store the information that is unique to each subjob, such as run_count, resources_used, comment, etc., so that a query of a subjob after the job has finished returns that subjob's own values rather than just the parent's information.

      Also please consider

      Some of the attributes that we should consider making unique to each subjob are:
      resources_used.cpupercent = 1716
      resources_used.cput = 26:01:06
      resources_used.mem = 34733656kb
      resources_used.ncpus = 15
      resources_used.vmem = 34733656kb
      resources_used.walltime = 01:54:51
      Error_Path =
      exec_host =
      exec_vnode =
      mtime = Wed Nov 8 20:31:47 2017
      Output_Path =
      stime = Wed Nov 8 18:36:56 2017
      session_id = 27561
      substate = 42
      comment = Job run at Wed Nov 08 at 18:36 on
      etime = Wed Nov 8 18:36:56 2017
      run_count = 1
      array_index =
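
The request above amounts to keeping a small per-subjob record on disk instead of only in server memory. The following is a minimal sketch of that idea, not PBS Pro's actual implementation: the `SubjobRecord` class, the `save_subjobs` and `load_subjobs` helpers, and the JSON file layout are all hypothetical and chosen only for illustration. The field names mirror a few of the attributes listed above.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class SubjobRecord:
    """Hypothetical per-subjob state; fields mirror attributes from the list above."""
    array_index: int
    substate: int = 0
    run_count: int = 0
    session_id: Optional[int] = None
    exec_host: str = ""
    comment: str = ""
    resources_used: Dict[str, str] = field(default_factory=dict)


def save_subjobs(path: str, records: List[SubjobRecord]) -> None:
    """Write all records to a temp file, then rename it over the target.

    The rename is atomic, so a crash mid-write never leaves a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump([asdict(r) for r in records], fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp_path, path)


def load_subjobs(path: str) -> List[SubjobRecord]:
    """Rebuild the per-subjob records after a (simulated) server restart."""
    with open(path) as fh:
        return [SubjobRecord(**item) for item in json.load(fh)]
```

In this sketch the server would call `save_subjobs` whenever a subjob changes state and `load_subjobs` at startup, so each running subjob keeps its own run_count, session_id, and resources_used rather than inheriting the parent's values.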


              • Assignee:
                shrinivas.harapanahalli Shrinivas Harapanahalli
                scc Scott Campbell
              • Votes:
                0
              • Watchers:
                4


                • Created: