PBS kills suspended jobs when pbs_comm is restarted

Description

When pbs_comm is restarted in a cluster, all suspended jobs are killed. Analysis points to the following check in mom_comm.c, which does not account for jobs in a suspended state:

if (pjob->ji_qs.ji_substate == JOB_SUBSTATE_PRERUN ||
    pjob->ji_qs.ji_substate == JOB_SUBSTATE_RUNNING) {
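
One possible shape for a fix is to extend the check so that suspended substates are treated the same way as PRERUN and RUNNING. The sketch below is illustrative only and is not the actual patch: the enum, the trimmed-down struct job, and the helper job_is_live_for_comm_check() are hypothetical stand-ins, and the substate names JOB_SUBSTATE_SUSPEND and JOB_SUBSTATE_SCHSUSP are assumed to correspond to the suspended substates in the PBS headers.

```c
#include <stdbool.h>

/* Stand-in substates for illustration; the real definitions live in the
 * PBS headers and their numeric values are not reproduced here. */
enum job_substate {
    JOB_SUBSTATE_PRERUN,
    JOB_SUBSTATE_RUNNING,
    JOB_SUBSTATE_SUSPEND,  /* assumed: suspended by a user or operator */
    JOB_SUBSTATE_SCHSUSP,  /* assumed: suspended by the scheduler */
    JOB_SUBSTATE_EXITING
};

/* Minimal stand-in for the job structure: only the field that the
 * quoted check reads. */
struct job {
    struct {
        int ji_substate;
    } ji_qs;
};

/* Hypothetical helper: returns true when the job should be handled like
 * a running job by the check, now including suspended substates. */
bool job_is_live_for_comm_check(const struct job *pjob)
{
    switch (pjob->ji_qs.ji_substate) {
    case JOB_SUBSTATE_PRERUN:
    case JOB_SUBSTATE_RUNNING:
    case JOB_SUBSTATE_SUSPEND:
    case JOB_SUBSTATE_SCHSUSP:
        return true;
    default:
        return false;
    }
}
```

With a helper like this, the condition in mom_comm.c would read if (job_is_live_for_comm_check(pjob)) {, so a suspended job takes the same path as a running one when the pbs_comm connection is lost and restored.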

The issue can be reproduced by suspending a multi-node job that spans two or more pbs_comm instances (head node connected to pbs_comm A, some number of sister nodes attached to pbs_comm B), then stopping and restarting pbs_comm A.

Acceptance Criteria

None

Status

Assignee

Ram Pranesh

Reporter

Ram Pranesh

Severity

None

OS

None

Start Date

None

Pull Request URL

None

Story Points

1

Components

Affects versions

Priority

Low