  1. PP-881

PTL framework is not consistent while returning PBS error code through PbsManagerError exception


    • Type: Bug
    • Status: Open
    • Priority: High
    • Resolution: Unresolved
    • Affects Version/s: 18.1.1
    • Fix Version/s: None
    • Component/s: PTL Framework
    • Labels:
    • Severity:
    • Sprint:
    • Story Points:


      PTL does not report the correct PBS error code when a PBS command fails and it raises the PbsManagerError exception.
      Our analysis follows.
      PTL appears to use Popen to execute the qmgr command in CLI mode. The relevant code is:
      p = Popen(runcmd, bufsize=-1, stdin=stdin, stdout=stdout, stderr=stderr, cwd=cwd, env=env)
      except Exception, e:
      After Popen runs, PTL takes the return code from the following statements.
      (o, e) = p.communicate(input)
      ret['rc'] = p.returncode
      This p.returncode is simply the exit status of the forked process, which is qmgr in this case. It is not the PBS error code. For example:
      [root@stblr3 tests]# qmgr -c "s q q5 started=t"
      qmgr obj=q5 svr=default: Unknown queue
      qmgr: Error (15018) returned from server
      [root@stblr3 tests]# echo $?
      [root@stblr3 tests]# qmgr -c 'create queue Q1 started=False,queue_type=route,partition=P1,enabled=False'
      qmgr obj=Q1 svr=default: Can not assign a partition to route queue
      qmgr: Error (15217) returned from server
      [root@stblr3 tests]# echo $?
      As per our analysis, in the second case above (server error 15217) e.rc comes out as 113, which is not a PBS error code.
      Further analysis:
      On Linux, the exit status that a parent process sees is truncated to its low 8 bits by the operating system, so even though qmgr exits with 15217, the caller receives 15217 % 256 = 113. That is the error code we see above. On Windows, the exact error code, i.e. 15217, might be returned.
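The truncation can be reproduced without PBS. The following is a minimal sketch, not PTL code: the helper names visible_exit_status and run_child_exiting_with are hypothetical, and a child Python interpreter stands in for qmgr. On a POSIX system both helpers should report 113 for an exit code of 15217; the behavior on Windows is not verified here.

```python
import os
import subprocess
import sys


def visible_exit_status(exit_code):
    """Return the status a POSIX parent sees: only the low 8 bits survive."""
    return exit_code & 0xFF


def run_child_exiting_with(exit_code):
    """Start a child Python process that exits with exit_code and return
    the value that Popen.returncode reports for it."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.exit(%d)" % exit_code])
    child.communicate()
    return child.returncode


if __name__ == "__main__":
    print("platform:", os.name)
    # 15018 and 15217 are the server error codes from the qmgr examples.
    for pbs_code in (15018, 15217):
        print(pbs_code,
              visible_exit_status(pbs_code),
              run_child_exiting_with(pbs_code))
```

Because the high bits are discarded, different PBS error codes that share the same low 8 bits cannot be told apart from the exit status alone.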
      It would be better for PTL to parse the error message from the object e returned by the statement
      (o, e) = p.communicate(input) and then set rc to the actual PBS error code, since PTL itself raises the exception. This would give consistent behavior on both Linux and non-Linux environments. PTL should not assume that the exit status of qmgr is always the PBS error code.
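One way the proposed parsing could look is sketched below. This is an illustration under assumptions, not the actual PTL fix: the function name pbs_error_code and the pattern _PBS_ERROR_RE are hypothetical, the message format is taken only from the two qmgr examples above, and the stderr text is assumed to be already decoded to a string.

```python
import re

# Matches the trailer qmgr printed in the examples above, e.g.
# "qmgr: Error (15217) returned from server".
_PBS_ERROR_RE = re.compile(r"Error \((\d+)\) returned from server")


def pbs_error_code(stderr_text, process_rc):
    """Return the PBS error code found in stderr_text.

    Falls back to process_rc (the raw exit status) when the text
    contains no recognizable error trailer.
    """
    match = _PBS_ERROR_RE.search(stderr_text or "")
    if match:
        return int(match.group(1))
    return process_rc


if __name__ == "__main__":
    sample = ("qmgr obj=Q1 svr=default: Can not assign a partition to route queue\n"
              "qmgr: Error (15217) returned from server\n")
    # 113 is the truncated exit status PTL currently stores in rc.
    print(pbs_error_code(sample, 113))
```

Falling back to the exit status keeps the current behavior for any failure whose output does not match the assumed format.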





              • Assignee:
                Suresh-thelkar Suresh Thelkar

