Run MPI with containers

Simple test

A simple container definition with UCX and Open MPI:

Bootstrap: docker
From: debian:trixie

%environment
    export LD_LIBRARY_PATH=/opt/lib:/usr/local/lib

%post
	apt-get update && apt-get install -y build-essential wget rdma-core libibverbs-dev
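	# Build UCX 1.18.1 and install it under /opt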
	wget https://github.com/openucx/ucx/releases/download/v1.18.1/ucx-1.18.1.tar.gz
	tar xzf ucx-1.18.1.tar.gz
	cd ucx-1.18.1
	mkdir build
	cd build
	../configure --prefix=/opt/
	make -j4
	make install
	cd ..
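	# Build Open MPI 4.1.6 against the UCX installed above (default prefix /usr/local)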
	export OPENMPI_VERSION="4.1.6"
	export OPENMPI_MAJOR_VERSION="v4.1"
	export OPENMPI_MAKE_OPTIONS="-j4"
	mkdir -p /openmpi-src
	cd /openmpi-src
	wget https://download.open-mpi.org/release/open-mpi/${OPENMPI_MAJOR_VERSION}/openmpi-${OPENMPI_VERSION}.tar.gz
	tar xfz openmpi-${OPENMPI_VERSION}.tar.gz
	cd openmpi-${OPENMPI_VERSION} && ./configure --with-ucx=/opt --without-verbs
	make all ${OPENMPI_MAKE_OPTIONS}
	make install
	cd /
	rm -rf /openmpi-src

To build it:

singularity build -f openmpitest.sif openmpi.def
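
One quick sanity check is to inspect the Open MPI installation inside the container; the grep below simply looks for the UCX and PMIx components in the ompi_info output (nothing here is site-specific):

singularity exec openmpitest.sif ompi_info --version
singularity exec openmpitest.sif ompi_info | grep -iE "ucx|pmix"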

Then we compile an MPI application inside the container, for example the OSU micro-benchmarks:

wget https://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-7.5-1.tar.gz
tar -xvf osu-micro-benchmarks-7.5-1.tar.gz
singularity shell --bind /scratch openmpitest.sif
cd osu-micro-benchmarks-7.5-1
./configure CC=/usr/local/bin/mpicc CXX=/usr/local/bin/mpicxx --prefix=/scratch/$USER/osu_install
make install
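
After exiting the container shell, a quick check that the benchmarks were installed where the job script below expects them (the path simply follows from the --prefix passed to configure):

ls /scratch/$USER/osu_install/libexec/osu-micro-benchmarks/mpi/collective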

Then you can use the following Slurm job script:

#!/bin/bash

#SBATCH -N 2
#SBATCH -n 2
#SBATCH -o mpi-%j.out
#SBATCH -e mpi-%j.err

module purge
module load singularityce
module load openmpi
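# PMIx workarounds for the psec and gds errors described below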
export PMIX_MCA_psec=native
export PMIX_MCA_gds=^ds12

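# Make /scratch from the host available inside the container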
export SINGULARITY_BINDPATH=/scratch

srun --mpi=pmix singularity run openmpitest.sif /scratch/$USER/osu_install/libexec/osu-micro-benchmarks/mpi/collective/osu_alltoall
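
Assuming the script above is saved as osu_job.sh (the file name here is arbitrary), submit it and check the output file once the job has finished:

sbatch osu_job.sh
squeue -u $USER                # wait until the job has completed
cat mpi-<jobid>.out            # <jobid> is the id printed by sbatch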

Some possible errors

If the option --mpi=pmix is not used, you will get the following error:

[dna067:2560172] OPAL ERROR: Unreachable in file pmix3x_client.c at line 111
--------------------------------------------------------------------------
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.

By default srun uses PMI2 to launch the processes, but the Open MPI inside the container was not built with PMI2 support. Since Open MPI 3.1.0, PMIx support is included by default, which is why --mpi=pmix works.
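
You can check which PMI plugins your Slurm installation actually provides with srun itself; pmix (or a versioned variant such as pmix_v3) has to appear in this list for --mpi=pmix to work. The exact list depends on how Slurm was built at your site:

srun --mpi=list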

Psec error

You may also see this error:

A requested component was not found, or was unable to be opened.  This
means that this component is either not installed or is unable to be
used on your system (e.g., sometimes this means that shared libraries
that the component requires are unable to be found/loaded).  Note that
PMIX stopped checking at the first component that it did not find.

Host:      dna075
Framework: psec
Component: munge

Here, the application will still run. This message is related to PMIX_SECURITY_MODE: when srun launches the tasks, it sets this variable to munge,native, which means the munge protocol should be used for authentication. Since the PMIx library inside the container (the client side) does not have the munge component, it fails over to the native component and prints this warning. Setting export PMIX_MCA_psec=native avoids the message.
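
If you want to see what srun sets, you can print the variable from a launched task; on a system behaving as described above it should report munge,native (a minimal check, the exact value depends on your Slurm/PMIx configuration):

srun -N 1 -n 1 --mpi=pmix printenv PMIX_SECURITY_MODE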

gds error

You can also see this error:

[dna075:373342] PMIX ERROR: ERROR in file gds_ds12_lock_pthread.c at line 168

This is an OpenPMIx bug related to the gds ('Generalized DataStore for storing job-level and other data') framework. You can blacklist the ds12 component by setting: export PMIX_MCA_gds=^ds12.
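
The ^ prefix is the usual MCA syntax for excluding components instead of selecting them; several components can be listed, separated by commas. For illustration (ds21 is simply another gds component shipped with recent PMIx versions):

export PMIX_MCA_gds=^ds12          # exclude only ds12
export PMIX_MCA_gds=^ds12,ds21     # exclude both ds12 and ds21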