Building and Installing PBS 19 and OpenMPI: Tutorial

Notes

  • Building the PTL RPM (for testing) is optional
  • You may want to use the 18.1.3 release rather than the 19.1.1 beta. An issue was found in the beta where the database did not start; it will be fixed prior to the official release.

Tutorial

Overview
========
This tutorial will accomplish three things:
1. Build PBS Professional (OSS release 19.1.1 beta 1 for this example)
2. Build OpenMPI with support for PBS Professional task manager interface
3. Build and run some sample MPI applications 

OpenMPI will be installed under /opt/openmpi
PBS Pro will be installed under /opt/pbs
PTL will be installed under /opt/ptl

Prerequisites
=============
- Two VMs with two virtual CPUs each (pbs-server and mom-2 for this example)
- Root access on both VMs (needed for installing PBS Pro and OpenMPI)
- /opt does not squash UID for SUID binaries
- VMs configured to communicate with each other
- Same OS on both VMs (to prevent building everything twice)
- Internet access to download source code
- Build dependencies for PBS Pro and OpenMPI are installed on the primary VM
  (see the example after this list)
- Installation dependencies for PBS Pro and OpenMPI are installed on both VMs
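
On CentOS 7, for example, the build dependencies can be installed roughly as
shown below. This package list is an approximation; the authoritative list is
in the INSTALL file of the pbspro source tree, and package names vary by
distribution and release (gcc-c++ is included here for the OpenMPI build):
$ sudo yum install -y gcc gcc-c++ make rpm-build libtool autoconf automake \
        hwloc-devel libX11-devel libXt-devel libedit-devel libical-devel \
        ncurses-devel perl postgresql-devel postgresql-contrib python-devel \
        tcl-devel tk-devel swig expat-devel openssl-devel libXext libXft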

Setup
=====
Any existing PBS Pro or OpenMPI packages should be uninstalled.
$ rpm -qa | grep pbs
$ rpm -qa | grep openmpi

Use yum, zypper, apt-get, etc. to uninstall these packages. Check the
contents of /opt to ensure the pbs and openmpi directories are not present.
Also remove /etc/pbs.conf and the PBS_HOME (/var/spool/pbs) directory, as
shown in the example below.
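
For example, on a CentOS system left over from a previous install, the cleanup
might look like the following (package names are assumed to match the ones
built later in this tutorial; stop the pbs service first if it is running):
$ sudo systemctl stop pbs
$ sudo yum remove pbspro-server pbspro-execution pbspro-client pbspro-ptl openmpi
$ sudo rm -rf /opt/pbs /opt/openmpi /opt/ptl /etc/pbs.conf /var/spool/pbs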

PBS Pro and OpenMPI distribution packages will be built under ~/work on the
primary VM. The RPMs will be built in the standard rpmbuild location. Note
that these directories may already exist.
$ mkdir -p ~/work
$ mkdir -p ~/rpmbuild ~/rpmbuild/BUILD ~/rpmbuild/BUILDROOT ~/rpmbuild/RPMS \
        ~/rpmbuild/SOURCES ~/rpmbuild/SPECS ~/rpmbuild/SRPMS
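
If the rpmdevtools package is available for your distribution, the same
rpmbuild directory tree can be set up with a helper command instead (it may
not create BUILDROOT, which rpmbuild will create on demand):
$ sudo yum install rpmdevtools
$ rpmdev-setuptree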

Build PBS Professional
======================
$ cd ~/work
$ curl -so - https://codeload.github.com/PBSPro/pbspro/tar.gz/v19.1.1beta1 | \
        gzip -cd | tar -xf -
$ cd pbspro-19.1.1beta1
$ ./autogen.sh
[output omitted]
[ Note: You will see several "wildcard" warnings in the output because
  wildcard directives are used in some of the Makefile.am files. These
  messages may be ignored. ]
$ ./configure PBS_VERSION='19.1.0' --prefix=/opt/pbs
[output omitted]
$ make dist
[output omitted]
$ cp pbspro-19.1.0.tar.gz ~/rpmbuild/SOURCES
$ cp pbspro.spec ~/rpmbuild/SPECS
$ cd ~/rpmbuild/SPECS
$ rpmbuild -ba --with ptl pbspro.spec

Install PBS Professional
========================
This example is run on CentOS using yum. Adjust accordingly for the OS.
$ cd ~/rpmbuild/RPMS/x86_64
$ sudo yum install pbspro-server-19.1.0-0.x86_64.rpm
[output omitted]
Optionally, install PTL:
$ sudo yum install pbspro-ptl-19.1.0-0.x86_64.rpm

- Set PBS_START_MOM=1 on the primary VM
- Start PBS Pro on the primary VM
- Copy the pbspro-execution RPM to the secondary VM and install it
- Start PBS Pro on the secondary VM
- Use qmgr to add the secondary VM to the complex
- Confirm that the secondary VM is available (pbsnodes -av)

Build OpenMPI
=============
The current release as of January 4, 2019 is OpenMPI 4.0.0.
$ cd ~/work
$ curl -sO https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.0.tar.gz
$ tar -xOf openmpi-4.0.0.tar.gz */contrib/dist/linux/openmpi.spec | \
        sed 's/$VERSION/4.0.0/' | sed 's/$EXTENSION/gz/' >openmpi.spec
$ cp openmpi.spec ~/rpmbuild/SPECS
$ cp openmpi-4.0.0.tar.gz ~/rpmbuild/SOURCES
$ cd ~/rpmbuild/SPECS
$ rpmbuild -D 'configure_options --without-slurm --with-tm=/opt/pbs' \
        -D 'install_in_opt 1' -ba openmpi.spec

[ Note: Versions of PBS Pro prior to 18.x require an environment variable
  to be set prior to building OpenMPI: LIBS=-ldl ]

Install OpenMPI
===============
This example is run on CentOS using yum. Adjust accordingly for the OS.
$ cd ~/rpmbuild/RPMS/x86_64
$ sudo yum install openmpi-4.0.0-*.rpm
[output omitted]

Add profile scripts for OpenMPI:
$ cd ~/work
$ cat <<'EOF' >openmpi.sh
PATH=${PATH}:/opt/openmpi/4.0.0/bin
MANPATH=${MANPATH}:/opt/openmpi/4.0.0/man
EOF
$ sudo cp openmpi.sh /etc/profile.d/openmpi.sh
$ cat <<'EOF' >openmpi.csh
setenv PATH ${PATH}:/opt/openmpi/4.0.0/bin
setenv MANPATH ${MANPATH}:/opt/openmpi/4.0.0/man
EOF
$ sudo cp openmpi.csh /etc/profile.d/openmpi.csh

Copy the RPM to the secondary VM and install it there as well.
Copy the /etc/profile.d/openmpi.* scripts to the secondary VM.

====================================================================
STOP! STOP! STOP! STOP! STOP! STOP! STOP! STOP! STOP! STOP!
====================================================================
Before you proceed, log out and log back in. This will cause your login
shell to process the new files in /etc/profile.d and set up your PATH and
MANPATH correctly. Once you have logged back in, ensure your PATH and
MANPATH contain references to the appropriate directories. This may
include PTL if it was installed. As an alternative, you may source the
files directly from your login shell without logging out.
====================================================================

Compile and Run a Job with OpenMPI
==================================
$ cd ~/work
$ cat <<'EOF' >hello_mpi.c
#define _GNU_SOURCE     /* asprintf() is a GNU extension */
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <limits.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
    int rank, size;
    char hostname[HOST_NAME_MAX];
    /* MPI_Comm_get_attr() leaves these untouched if the attribute is not set */
    void *appnum = NULL;
    void *univ_size = NULL;
    char *appstr, *unistr;
    int flag;
    char *envar;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_APPNUM, &appnum, &flag);
    if (NULL == appnum) {
        asprintf(&appstr, "UNDEFINED");
    } else {
        asprintf(&appstr, "%d", *(int*)appnum);
    }
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_UNIVERSE_SIZE, &univ_size, &flag);
    if (NULL == univ_size) {
        asprintf(&unistr, "UNDEFINED");
    } else {
        asprintf(&unistr, "%d", *(int*)univ_size);
    }
    gethostname(hostname, sizeof(hostname));
    envar = getenv("OMPI_UNIVERSE_SIZE");
    printf("Rank:%d/%d Host:%s App#:%s MPI_UNIVERSE_SIZE:%s OMPI_UNIVERSE_SIZE:%s\n",
           rank, size, hostname, appstr, unistr,
           (NULL == envar) ? "NULL" : envar);
    MPI_Finalize();
    return 0;
}
EOF
$ mpicc -o hello_mpi hello_mpi.c
$ ssh mom-2 mkdir work
$ scp hello_mpi mom-2:work
$ cat <<EOF >mpijob
#PBS -l select=4:ncpus=1:mem=64m
#PBS -j oe
mpirun ~/work/hello_mpi
EOF
$ qsub mpijob
8.pbs-server
$ cat mpijob.o8
Rank:0/4 Host:pbs-server App#:0 MPI_UNIVERSE_SIZE:4 OMPI_UNIVERSE_SIZE:4
Rank:2/4 Host:mom-2 App#:0 MPI_UNIVERSE_SIZE:4 OMPI_UNIVERSE_SIZE:4
Rank:3/4 Host:mom-2 App#:0 MPI_UNIVERSE_SIZE:4 OMPI_UNIVERSE_SIZE:4
Rank:1/4 Host:pbs-server App#:0 MPI_UNIVERSE_SIZE:4 OMPI_UNIVERSE_SIZE:4

Mom logs from pbs-server (where ranks 0 and 1 were run):
========================================================
01/07/2019 14:21:06;0008;pbs_mom;Job;8.pbs-server;nprocs: 315, cantstat: 0, nomem: 0, skipped: 0, cached: 0
01/07/2019 14:21:06;0008;pbs_mom;Job;8.pbs-server;Started, pid = 120710
01/07/2019 14:21:07;0080;pbs_mom;Job;8.pbs-server;task 00000001 terminated
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;Terminated
01/07/2019 14:21:07;0100;pbs_mom;Job;8.pbs-server;task 00000001 cput= 0:00:00
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;kill_job
01/07/2019 14:21:07;0100;pbs_mom;Job;8.pbs-server;pbs-server cput= 0:00:00 mem=424kb
01/07/2019 14:21:07;0100;pbs_mom;Job;8.pbs-server;mom-2 cput= 0:00:00 mem=0kb
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;no active tasks
01/07/2019 14:21:07;0100;pbs_mom;Job;8.pbs-server;Obit sent

Mom logs from mom-2 (where ranks 2 and 3 were run):
===================================================
01/07/2019 14:21:06;0008;pbs_mom;Job;8.pbs-server;JOIN_JOB as node 1
01/07/2019 14:21:06;0008;pbs_mom;Job;8.pbs-server;task 20000001 started, orted
01/07/2019 14:21:07;0080;pbs_mom;Job;8.pbs-server;task 20000001 terminated
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;KILL_JOB received
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;kill_job
01/07/2019 14:21:07;0100;pbs_mom;Job;8.pbs-server;task 20000001 cput= 0:00:00
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;DELETE_JOB received
01/07/2019 14:21:07;0008;pbs_mom;Job;8.pbs-server;kill_job

Notes:
======
- If the user's home directory were shared across both VMs (e.g. via NFS),
  there would have been no need to create the work directory or copy the
  hello_mpi binary to mom-2.

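If the ranks on the secondary VM fail to start, or mom-2 records no MOM log
entries for the job, a likely cause is that OpenMPI was built without PBS task
manager support and mpirun is falling back to ssh. A quick sanity check
(assuming the install prefix used in this tutorial) is to list the MCA
components and look for the tm entries:
$ /opt/openmpi/4.0.0/bin/ompi_info | grep tm
The output should list tm components for the plm and ras frameworks; if none
appear, re-check the --with-tm=/opt/pbs option passed to rpmbuild.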

Testing

test.sh:

#!/bin/bash
#PBS -N pbs-openmpi-sh
#PBS -l select=2:ncpus=2:mpiprocs=2
#PBS -l place=scatter
/opt/openmpi/4.0.0/bin/mpirun -np `cat $PBS_NODEFILE | wc -l` /bin/hostname

$ qsub test.sh
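
The same check can be run interactively (assuming interactive jobs are allowed
on this complex); the full path to mpirun is used here in case the
/etc/profile.d scripts are not sourced in the interactive session:
$ qsub -I -l select=2:ncpus=2:mpiprocs=2 -l place=scatter
$ /opt/openmpi/4.0.0/bin/mpirun /bin/hostname
With 2 chunks of 2 mpiprocs each, mpirun should launch four processes and each
host should print its hostname twice.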



