  1. PBS Pro
  2. PP-783

race condition in mom hook transport initiated by provisioning


    • Type: Bug
    • Status: In Progress
    • Priority: Low
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 17.1.1
    • Component/s: None
    • Labels:
    • Story Points:
    • Acceptance Criteria:
      Mom hooks are sync'd by server when a mom comes up after a node is provisioned


      When a node being provisioned needs to be rebooted, the provisioning script initiates the reboot and then immediately returns success. This works well, but PBS then does something unhelpful. start_vnode_provisioning() calls execute_python_prov_script(), and if the return value indicates success, the pbs_server child running start_vnode_provisioning() goes on to send mom hooks to the node, which is by now in the act of rebooting. This step appears to be unnecessary. Removing it with the following change seems to work as expected, and if the new pbs_mom instance that starts after the reboot needs updated mom hooks, pbs_server provides them as it does for any new pbs_mom instance:

      ***************
      *** 5410,5415 ****
      --- 5382,5388 ----

        /* exit with the return code from the script */
        rc = execute_python_prov_script(phook, prov_vnode_info);
      + #ifndef NAS /* localmod 156 */
        if ((rc == 0) && (mom_hooks_seen_count() > 0))
        {
            int ret;
            /* Point path_hooks_tracking file to some private */
      ***************
      *** 5475,5480 ****
      --- 5448,5454 ----
        }
      + #endif /* localmod 156 */
        /* if python did sys.exit we wont be here */
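
The ordering problem described above can be illustrated with a small stand-alone model. This is only a sketch and not PBS code: struct node_model, run_prov_script(), send_hooks(), mom_restart_sync() and provision() are all hypothetical names invented for the illustration, and the model simply assumes that anything sent to a node that is going down for reboot is lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a node under provisioning; not a PBS structure. */
struct node_model {
    bool rebooting;      /* true once the provisioning script starts a reboot */
    int  hooks_received; /* hook files that reached a running pbs_mom */
    int  hooks_lost;     /* hook files sent while the node was going down */
};

/* Stand-in for the provisioning script: it kicks off the reboot and
 * reports success (0) right away, without waiting for the node. */
static int run_prov_script(struct node_model *n)
{
    assert(n != NULL);
    n->rebooting = true;
    return 0;
}

/* Stand-in for the server pushing hooks to the node. Assumption of this
 * model: a send that races with the reboot is counted as lost. */
static void send_hooks(struct node_model *n, int count)
{
    assert(n != NULL);
    if (n->rebooting)
        n->hooks_lost += count;
    else
        n->hooks_received += count;
}

/* Stand-in for the new pbs_mom coming up after the reboot and being
 * synced by the server, as happens for any new pbs_mom instance. */
static void mom_restart_sync(struct node_model *n, int count)
{
    assert(n != NULL);
    n->rebooting = false;
    n->hooks_received = 0; /* a fresh instance starts without hooks */
    send_hooks(n, count);
}

/* Models the provisioning path. With send_after_script set, hooks are
 * sent right after the script succeeds (the behavior the report objects
 * to); with it cleared, that send is skipped (the proposed behavior). */
static int provision(struct node_model *n, int hook_count,
                     bool send_after_script)
{
    int rc = run_prov_script(n);

    if (rc == 0 && hook_count > 0 && send_after_script)
        send_hooks(n, hook_count);
    return rc;
}
```

In this model, calling provision() with send_after_script set leaves hooks_lost non-zero, while clearing it leaves hooks_lost at zero. In both cases a later mom_restart_sync() delivers the full set of hooks, which is the outcome the Acceptance Criteria asks for.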





              • Assignee:
                riyazhakki Mohammad Riyaz M Hakki
                smgoosen Sam Goosen
              • Votes:
                0
              • Watchers:
                2


                • Created: