2018-10-10 08:24:33,800 INFO input command: pbs_benchpress -p nomom=server,moms=mom@/etc/pbs.conf,momtype=mom@cpuset -t TestSchedSubjobBadstate -o /tmp/TestSchedSubjobBadstate_with_other_fixes.txt
2018-10-10 08:24:33,809 INFO param: nomom=server,moms=mom@/etc/pbs.conf,momtype=mom@cpuset
2018-10-10 08:24:33,813 INFO ptl version: 19.2.0
2018-10-10 08:24:33,817 INFO platform: Linux server 3.10.0-693.21.1.el7.x86_64 #1 SMP Fri Feb 23 18:54:16 UTC 2018 x86_64 x86_64
2018-10-10 08:24:33,821 INFO python version: 2.7.13
2018-10-10 08:24:33,825 INFO user: root
2018-10-10 08:24:33,829 INFO --------------------------------------------------------------------------------
2018-10-10 08:24:33,833 INFO Cleaning up temporary files
2018-10-10 08:24:33,845 INFO Cleaning up /var/tmp dir
2018-10-10 08:24:33,852 INFO Cleaning up /tmp dir
2018-10-10 08:25:05,677 INFO ======================================================================
2018-10-10 08:25:05,683 INFO suite name: TestSchedSubjobBadstate
2018-10-10 08:25:05,686 INFO ======================================================================
2018-10-10 08:25:05,691 INFO ===========================================
2018-10-10 08:25:05,695 INFO Entered TestSchedSubjobBadstate setUpClass
2018-10-10 08:25:05,699 INFO ===========================================
2018-10-10 08:25:05,705 INFOCLI2 server: id pbsuser
2018-10-10 08:25:05,790 INFOCLI2 server: id pbsuser1
2018-10-10 08:25:05,874 INFOCLI2 server: id pbsuser2
2018-10-10 08:25:05,954 INFOCLI2 server: id pbsuser3
2018-10-10 08:25:06,054 INFO FQDN name server.ib0.smc-default.chf.rdlabs.hpecorp.net differs from name provided server
2018-10-10 08:25:06,345 INFO server server: server operating mode set to cli
2018-10-10 08:25:06,354 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:06,666 INFO server server: version 19.2.0
2018-10-10 08:25:06,673 INFO expect action: created new action kicksched
2018-10-10 08:25:06,678 INFO expect action: added action kicksched to server server
2018-10-10 08:25:06,686 INFO FQDN name server.ib0.smc-default.chf.rdlabs.hpecorp.net differs from name provided server
2018-10-10 08:25:06,807 INFO FQDN name server.ib0.smc-default.chf.rdlabs.hpecorp.net differs from name provided server
2018-10-10 08:25:07,196 INFOCLI2 server: sudo -H /opt/pbs/sbin/pbsfs
2018-10-10 08:25:07,578 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/resource_group
2018-10-10 08:25:07,801 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/holidays
2018-10-10 08:25:08,029 INFOCLI server: /opt/pbs/bin/qmgr -c list sched default
2018-10-10 08:25:08,705 INFOCLI2 server: sudo -H /opt/pbs/sbin/pbsfs
2018-10-10 08:25:09,095 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/resource_group
2018-10-10 08:25:09,320 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/holidays
2018-10-10 08:25:09,557 INFO FQDN name mom.ib0.smc-default.chf.rdlabs.hpecorp.net differs from name provided mom
2018-10-10 08:25:19,966 INFO ============================================
2018-10-10 08:25:19,973 INFO Completed TestSchedSubjobBadstate setUpClass
2018-10-10 08:25:19,977 INFO ============================================
2018-10-10 08:25:19,990 INFO test name: test_sched_badstate_subjob (tests.functional.pbs_sched_subjob_badstate.TestSchedSubjobBadstate)...
2018-10-10 08:25:19,996 INFO test start time: Wed Oct 10 08:25:19 2018
2018-10-10 08:25:20,001 INFO test docstring: This test case tests if scheduler goes into infinite loop when following conditions are met.
- Kill a mom
- mark the mom's state as free
- submit an array job
- check the sched log for "Leaving sched cycle" from the time array job was submitted.
If we are unable to find a log match then scheduler is in endless loop and test case has failed.
2018-10-10 08:25:20,008 INFO ======================================
2018-10-10 08:25:20,015 INFO Entered TestSchedSubjobBadstate setUp
2018-10-10 08:25:20,020 INFO ======================================
2018-10-10 08:25:20,028 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:20,344 INFO status on server: server
2018-10-10 08:25:20,354 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:20,673 INFO manager on server: unset server managers
2018-10-10 08:25:20,684 INFOCLI server: sudo -H /opt/pbs/bin/qmgr -c unset server managers
2018-10-10 08:25:21,211 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:21,518 INFO expect on server server: managers unset server server.ib0.smc-default.chf.rdlabs.hpecorp.net ... OK
2018-10-10 08:25:21,527 INFO manager on server: set server {'managers': (2, 'root@*')}
2018-10-10 08:25:21,534 INFOCLI server: sudo -H /opt/pbs/bin/qmgr -c set server managers+=root@*
2018-10-10 08:25:22,037 INFO server server: reverting configuration to defaults
2018-10-10 08:25:22,049 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:22,355 INFO select on server: __ALL__
2018-10-10 08:25:22,364 INFOCLI server: /opt/pbs/bin/qselect
2018-10-10 08:25:22,654 INFO delete job on server: 4[].server
2018-10-10 08:25:22,664 INFOCLI server: /opt/pbs/bin/qdel -W force 4[].server
2018-10-10 08:25:23,009 INFOCLI server: /opt/pbs/bin/qstat -f 4[].server
2018-10-10 08:25:23,398 INFOCLI server: /opt/pbs/bin/qstat -f @server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:23,701 INFO expect on server server: job_state set 0 job ... OK
2018-10-10 08:25:23,711 INFOCLI server: /opt/pbs/bin/pbs_rstat -f
2018-10-10 08:25:24,000 INFO manager on server: unset server ['pnames', 'pbs_license_max', 'pbs_license_min']
2018-10-10 08:25:24,010 INFOCLI server: /opt/pbs/bin/qmgr -c unset server pnames,pbs_license_max,pbs_license_min
2018-10-10 08:25:24,361 INFOCLI server: sudo -H /opt/pbs/bin/qmgr -c list hook
2018-10-10 08:25:24,818 INFOCLI server: /opt/pbs/bin/qstat -Qf @server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:25,117 INFO status on server: node
2018-10-10 08:25:25,126 INFOCLI server: /opt/pbs/bin/pbsnodes -s server.ib0.smc-default.chf.rdlabs.hpecorp.net -v -a
2018-10-10 08:25:25,563 INFO manager on server: delete queue workq
2018-10-10 08:25:25,571 INFOCLI server: /opt/pbs/bin/qmgr -c delete queue workq
2018-10-10 08:25:25,887 INFO server server: expect offset set to 0.5
2018-10-10 08:25:26,399 INFOCLI server: /opt/pbs/bin/qstat -Qf workq@server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:26,688 INFO expect on server server: unset queue workq ... OK
2018-10-10 08:25:26,698 INFO manager on server: create queue workq {'started': 'True', 'queue_type': 'Execution', 'enabled': 'True'}
2018-10-10 08:25:26,706 INFOCLI server: /opt/pbs/bin/qmgr -c create queue workq started=True,queue_type=Execution,enabled=True
2018-10-10 08:25:27,025 INFO status on server: queue workq
2018-10-10 08:25:27,035 INFOCLI server: /opt/pbs/bin/qstat -Qf workq@server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:27,329 INFO server server: expect offset set to 0.5
2018-10-10 08:25:27,840 INFOCLI server: /opt/pbs/bin/qstat -Qf workq@server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:28,128 INFO expect on server server: started set True || queue_type set Execution || enabled set True queue workq ... OK
2018-10-10 08:25:28,137 INFO manager on server: list sched
2018-10-10 08:25:28,143 INFOCLI server: /opt/pbs/bin/qmgr -c list sched
2018-10-10 08:25:28,696 INFOCLI2 server: sudo -H /opt/pbs/sbin/pbsfs
2018-10-10 08:25:29,080 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/resource_group
2018-10-10 08:25:29,311 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/holidays
2018-10-10 08:25:29,556 INFO manager on server: set server {'default_queue': 'workq'}
2018-10-10 08:25:29,567 INFOCLI server: /opt/pbs/bin/qmgr -c set server default_queue=workq
2018-10-10 08:25:29,907 INFO status on server: resource
2018-10-10 08:25:29,916 INFO manager on server: list resource
2018-10-10 08:25:29,923 INFOCLI server: /opt/pbs/bin/qmgr -c list resource
2018-10-10 08:25:30,229 INFO manager on server: delete resource router
2018-10-10 08:25:30,238 INFOCLI server: /opt/pbs/bin/qmgr -c delete resource router
2018-10-10 08:25:31,121 INFOCLI server: /opt/pbs/bin/qmgr -c list resource router
2018-10-10 08:25:31,424 INFO expect on server server: unset resource router ... OK
2018-10-10 08:25:31,434 INFOCLI status on server: server license_count
2018-10-10 08:25:31,441 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:25:31,734 INFO server: server.ib0.smc-default.chf.rdlabs.hpecorp.net licensed
2018-10-10 08:25:32,279 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/server_priv/comm.lock
2018-10-10 08:25:33,023 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/sched.lock
2018-10-10 08:25:33,243 INFO scheduler server: reverting configuration to defaults
2018-10-10 08:25:33,253 INFO manager on server: unset sched ['sched_priv', 'sched_cycle_length', 'scheduler_iteration', 'scheduling', 'sched_log']
2018-10-10 08:25:33,260 INFOCLI server: /opt/pbs/bin/qmgr -c unset sched sched_priv,sched_cycle_length,scheduler_iteration,scheduling,sched_log
2018-10-10 08:25:33,632 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/dedicated_time
2018-10-10 08:25:33,862 INFOCLI2 server: sudo -H cmp /opt/pbs/etc/pbs_resource_group /var/spool/pbs/sched_priv/resource_group
2018-10-10 08:25:34,085 INFO scheduler server: reverting holidays file to default
2018-10-10 08:25:34,094 INFOCLI2 server: sudo -H cmp /opt/pbs/etc/pbs_holidays /var/spool/pbs/sched_priv/holidays
2018-10-10 08:25:34,310 INFOCLI2 server: sudo -H cmp /opt/pbs/etc/pbs_sched_config /var/spool/pbs/sched_priv/sched_config
2018-10-10 08:25:34,529 INFO scheduler server: sent signal -HUP
2018-10-10 08:25:34,539 INFOCLI2 server: sudo -H kill -HUP 2996
2018-10-10 08:25:34,756 INFOCLI2 server: sudo -H /opt/pbs/sbin/pbsfs -e -I default
2018-10-10 08:25:35,966 INFOCLI2 server: sudo -H /usr/bin/cat /var/spool/pbs/sched_priv/sched.lock
2018-10-10 08:25:43,058 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net /usr/bin/python -c "import sys; print sys.platform"
2018-10-10 08:25:50,932 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/cat /var/spool/pbs/mom_priv/mom.lock
2018-10-10 08:25:54,587 ERROR mom mom.ib0.smc-default.chf.rdlabs.hpecorp.net is down
2018-10-10 08:25:57,984 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H ls -l /opt/pbs/libexec/pbs_init.d
2018-10-10 08:26:01,665 INFO running init script to start pbs mom on mom.ib0.smc-default.chf.rdlabs.hpecorp.net using /etc/pbs.conf init_cmd=['sudo', 'PBS_START_MOM=1', 'PBS_START_SERVER=0', 'PBS_START_SCHED=0', 'PBS_START_COMM=0', '/opt/pbs/libexec/pbs_init.d', 'start']
2018-10-10 08:26:01,704 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net which scp
2018-10-10 08:26:05,098 INFOCLI2 server: /usr/bin/scp -p /tmp/PtlPbsOB9KGS mom.ib0.smc-default.chf.rdlabs.hpecorp.net:/tmp/PtlPbsOB9KGS
2018-10-10 08:26:08,613 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net /tmp/PtlPbsOB9KGS
Contents of /tmp/PtlPbsOB9KGS:
----------------------------------------
#!/bin/bash
sudo PBS_START_MOM=1 PBS_START_SERVER=0 PBS_START_SCHED=0 PBS_START_COMM=0 /opt/pbs/libexec/pbs_init.d start
----------------------------------------
2018-10-10 08:28:06,444 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net which rm
2018-10-10 08:28:10,020 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net /usr/bin/rm /tmp/PtlPbsOB9KGS
2018-10-10 08:28:17,455 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/cat /var/spool/pbs/mom_priv/mom.lock
2018-10-10 08:28:25,146 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/cat /var/spool/pbs/mom_priv/mom.lock
2018-10-10 08:28:28,761 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /opt/pbs/sbin/pbs_mom --version
2018-10-10 08:28:32,568 INFO mom mom@/etc/pbs.conf: reverting configuration to defaults
2018-10-10 08:28:32,580 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/rm -f /var/spool/pbs/mom_priv/epilogue
2018-10-10 08:28:36,199 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/rm -f /var/spool/pbs/mom_priv/prologue
2018-10-10 08:28:39,832 INFOCLI mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /opt/pbs/sbin/pbs_mom -s list
2018-10-10 08:29:08,890 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net python -c "import tempfile;print tempfile.mkstemp('PtlPbstmpcopy')[1]"
2018-10-10 08:29:13,138 INFOCLI2 server: /usr/bin/scp /tmp/PtlPbsypuPXY mom.ib0.smc-default.chf.rdlabs.hpecorp.net:/tmp/tmpnfA1yAPtlPbstmpcopy
2018-10-10 08:29:16,681 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net which cp
2018-10-10 08:29:20,077 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/cp /tmp/tmpnfA1yAPtlPbstmpcopy /var/spool/pbs/mom_priv/config
2018-10-10 08:29:23,737 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net /usr/bin/rm /tmp/tmpnfA1yAPtlPbstmpcopy
2018-10-10 08:29:27,178 INFO mom mom@/etc/pbs.conf: sent signal -HUP
2018-10-10 08:29:27,189 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H kill -HUP 11673
2018-10-10 08:29:34,810 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H /usr/bin/cat /var/spool/pbs/mom_priv/mom.lock
2018-10-10 08:29:38,425 INFOCLI server: /opt/pbs/bin/pbsnodes -s server.ib0.smc-default.chf.rdlabs.hpecorp.net -v -a
2018-10-10 08:29:38,750 INFO status on server: node mom
2018-10-10 08:29:38,759 INFOCLI server: /opt/pbs/bin/pbsnodes -s server.ib0.smc-default.chf.rdlabs.hpecorp.net -v mom
2018-10-10 08:29:39,064 INFOCLI server: /opt/pbs/bin/pbsnodes -s server.ib0.smc-default.chf.rdlabs.hpecorp.net -v mom
2018-10-10 08:29:39,365 INFO expect on server server: state = free node mom ... OK
2018-10-10 08:29:39,373 INFO =======================================
2018-10-10 08:29:39,378 INFO Completed TestSchedSubjobBadstate setUp
2018-10-10 08:29:39,383 INFO =======================================
2018-10-10 08:29:39,389 INFO mom mom@/etc/pbs.conf: sent signal -KILL
2018-10-10 08:29:39,396 INFOCLI2 mom: ssh mom.ib0.smc-default.chf.rdlabs.hpecorp.net sudo -H kill -KILL 11673
2018-10-10 08:29:43,019 INFO manager on server: set node mom {'state': 'free', 'resources_available.ncpus': '2'}
2018-10-10 08:29:43,030 INFOCLI server: /opt/pbs/bin/qmgr -c set node mom state=free,resources_available.ncpus=2
2018-10-10 08:29:43,381 INFO server server: expect offset set to 0.5
2018-10-10 08:29:43,894 INFOCLI server: /opt/pbs/bin/pbsnodes -s server.ib0.smc-default.chf.rdlabs.hpecorp.net -v mom
2018-10-10 08:29:44,194 INFO expect on server server: state set free || resources_available.ncpus set 2 node mom ... OK
2018-10-10 08:29:44,203 INFO manager on server: set server {'scheduling': 'False'}
2018-10-10 08:29:44,210 INFOCLI server: /opt/pbs/bin/qmgr -c set server scheduling=False
2018-10-10 08:29:44,565 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:29:44,860 INFO expect on server server: scheduling set False && server_state set Idle server server.ib0.smc-default.chf.rdlabs.hpecorp.net ... OK
2018-10-10 08:29:44,878 INFO job: executable set to /bin/sleep with arguments: 100
2018-10-10 08:29:44,892 INFOCLI server: sudo -H -u pbsuser /opt/pbs/bin/qsub -l ncpus=2 -J 1-3 -- /bin/sleep 100
2018-10-10 08:29:45,411 INFO submit to server as pbsuser: job 5[].server OrderedDict([('Resource_List.ncpus', '2'), ('array_indices_submitted', '1-3')])
2018-10-10 08:29:45,421 INFO manager on server: set server {'scheduling': 'True'}
2018-10-10 08:29:45,428 INFOCLI server: /opt/pbs/bin/qmgr -c set server scheduling=True
2018-10-10 08:29:45,782 INFOCLI server: /opt/pbs/bin/qstat -Bf server.ib0.smc-default.chf.rdlabs.hpecorp.net
2018-10-10 08:29:46,100 INFO expect on server server: scheduling set True server server.ib0.smc-default.chf.rdlabs.hpecorp.net ... OK
2018-10-10 08:29:46,210 INFO scheduler server log match: searching for "Leaving Scheduling Cycle" - No match
2018-10-10 08:29:47,622 INFO scheduler server log match: searching for "Leaving Scheduling Cycle"... OK
2018-10-10 08:29:47,628 INFO delete on server: 5[].server
2018-10-10 08:29:47,633 INFO delete job on server: 5[].server
2018-10-10 08:29:47,639 INFOCLI server: /opt/pbs/bin/qdel 5[].server
2018-10-10 08:29:47,983 INFO =========================================
2018-10-10 08:29:47,991 INFO Entered TestSchedSubjobBadstate tearDown
2018-10-10 08:29:47,996 INFO =========================================
2018-10-10 08:29:48,002 INFO ==========================================
2018-10-10 08:29:48,007 INFO Completed TestSchedSubjobBadstate tearDown
2018-10-10 08:29:48,012 INFO ==========================================
2018-10-10 08:29:48,031 INFO ok
2018-10-10 08:29:48,067 INFO ================================================================================
run: 1, succeeded: 1, failed: 0, errors: 0, skipped: 0, timedout: 0
Tests run in 0:04:42.481243
2018-10-10 08:29:48,073 INFO Cleaning up temporary files
2018-10-10 08:29:48,080 INFO Cleaning up /var/tmp dir
2018-10-10 08:29:48,088 INFO Cleaning up /tmp dir