moms cannot communicate with one another in a cloud configuration when cloud nodes resolve each other's hostnames to IP addresses not known to the PBS server/comm

Description

When PBS is used in a configuration where cloud nodes resolve each other's names to one set of IP addresses, but the local PBS server/comm host resolves the same names to a different set of IP addresses (through a VPN), the moms cannot communicate with one another for multinode jobs. This is because, when the server runs a job, it sends exec_host2 to the primary execution host (a cloud node in this example) to tell it which nodes are in the job, and that host resolves the hostnames to the cloud addresses. When the primary execution host then tries to send messages to these addresses through pbs_comm, it cannot, because the comm knows only the VPN addresses.
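The mismatch described above can be illustrated with a small sketch. It is not PBS code: the function name, the hostnames, and the addresses are all hypothetical, and the two dictionaries stand in for whatever each side's name resolution would return. It only shows which job hosts the primary execution host would address using IPs the comm has never seen.

```python
from typing import Dict, List, Set


def hosts_unknown_to_comm(
    job_hosts: List[str],
    mom_view: Dict[str, Set[str]],
    comm_view: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per job host, the mom-resolved addresses the comm does not know.

    mom_view:  hostname -> addresses as resolved on the cloud nodes (assumed input)
    comm_view: hostname -> addresses as resolved on the server/comm host (assumed input)
    """
    # Every address the comm has registered, regardless of hostname.
    known_to_comm: Set[str] = set().union(*comm_view.values())

    problems: Dict[str, Set[str]] = {}
    for host in job_hosts:
        unknown = mom_view.get(host, set()) - known_to_comm
        if unknown:
            problems[host] = unknown
    return problems


# Hypothetical example: cloud nodes see 10.0.1.x, the comm sees VPN addresses.
mom_view = {"cloud-node1": {"10.0.1.11"}, "cloud-node2": {"10.0.1.12"}}
comm_view = {"cloud-node1": {"172.16.5.11"}, "cloud-node2": {"172.16.5.12"}}

mismatches = hosts_unknown_to_comm(["cloud-node1", "cloud-node2"], mom_view, comm_view)
for host, addrs in sorted(mismatches.items()):
    print(f"{host}: comm does not know {sorted(addrs)}")
```

In this example both hosts are reported, which matches the failure in this ticket: every address the primary execution host would use is missing from the comm's view.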

Acceptance Criteria

Multinode jobs run correctly when cloud nodes resolve each other's names to one set of IP addresses but the local PBS server/comm host resolves the same names to a different set of IP addresses (through a VPN).

Status

Assignee

Brem Anand J K

Reporter

Scott Campbell

Severity

3-High

OS

None

Start Date

None

Pull Request URL

None

Story Points

1

Components

Fix versions

Affects versions

14.1.0

Priority

Critical