Linux MPI

Introduction

Thayer School's Linux systems have the OpenMPI (http://www.open-mpi.org/) implementation of the MPI (Message Passing Interface) standard. MPI allows many processes, spread across multiple CPUs and multiple systems, to pass messages and data among themselves while working on a parallel computation.
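
To make this concrete, here is a minimal C sketch of an MPI program (the file name hello_mpi.c and the program itself are purely illustrative, not something provided on Thayer systems). Each process reports its rank, and rank 1 sends a single integer to rank 0 to show basic message passing:

/* hello_mpi.c - illustrative example only: every rank reports itself,
 * and rank 1 sends one integer to rank 0. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* start the MPI runtime           */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank (0..size-1) */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes       */

    printf("Hello from rank %d of %d\n", rank, size);

    if (size > 1) {
        int payload = 42;
        if (rank == 1) {
            /* send one int to rank 0 with tag 0 */
            MPI_Send(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else if (rank == 0) {
            int received;
            MPI_Recv(&received, 1, MPI_INT, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("Rank 0 received %d from rank 1\n", received);
        }
    }

    MPI_Finalize();                        /* shut down the MPI runtime       */
    return 0;
}

The sections below show how to compile and run a program like this on Thayer systems.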

For more information about parallel programming with MPI, we suggest visiting the OpenMPI page. There is a lot of good documentation and many examples there to get you started with parallel programming. Thayer Computing can also help with technical issues, and Dartmouth Research Computing has expertise to help with implementation and programming. If you have questions about, or trouble with, MPI on Thayer systems, contact us at computing@thayer.dartmouth.edu; if we can't help directly, we can put you in touch with the right people in Research Computing.

Compiling

If you are compiling your own MPI code, the easiest way to ensure that you have the correct headers and libraries is to use the OpenMPI "wrappers" for the compilers. For example, instead of using gcc, you would use mpicc:

$ mpicc -o mpiprog mpiprog.c

Here are the names of the wrappers for each of our standard compilers:

gcc -> mpicc
g++ -> mpicxx
gfortran -> mpif90
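
If you want to see exactly what a wrapper adds, OpenMPI's compiler wrappers accept a --showme option that prints the underlying compiler command line (including the MPI include and library flags) without actually compiling anything; the exact flags and paths depend on the installed OpenMPI version:

$ mpicc --showme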

Running an MPI Job

Once you have a compiled program set up to use MPI, you can use the mpirun command to run it. The man page for mpirun ('man mpirun') contains the most up-to-date and extensive documentation, but here are a few examples. Note: to run jobs on multiple machines, the executable must be on shared storage (ThayerFS, Jumbo, or DartFS).

To run myprog on babylon1:

mpirun -H babylon1 ./myprog

To run two instances of myprog on babylon1 and one on babylon2:

mpirun -H babylon1,babylon1,babylon2 ./myprog

To run four instances of myprog on babylon1:

mpirun -H babylon1,babylon1,babylon1,babylon1 ./myprog
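
Putting the compile and run steps together, a session might look like the following (using the illustrative hello_mpi.c sketch from the introduction, with the source and binary on shared storage). Each of the three launched processes prints its own "Hello from rank ..." line:

mpicc -o hello_mpi hello_mpi.c
mpirun -H babylon1,babylon1,babylon2 ./hello_mpi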

Host File

Hosts, and the number of slots (processes) to use on each, can also be specified in a host file. For example, if you create a file called myhosts with the following lines:

babylon1 slots=8
babylon2 slots=8

you can then specify this to mpirun:

mpirun -hostfile myhosts ./myprog

By default, mpirun will use all available slots (sixteen in this example). You can change this with the -np and/or -npernode options.

Run only eight processes (these will all be placed on babylon1, since its slots are the first eight available):

mpirun -hostfile myhosts -np 8 ./myprog

Run four each on babylon1 and babylon2:

mpirun -hostfile myhosts -np 8 -npernode 4 ./myprog
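
Before launching a long-running program, it can be helpful to check how processes will be placed by running a simple command such as hostname with the same options. Each launched process prints the name of the machine it landed on; with the example above you would expect to see babylon1 and babylon2 four times each, though the order of the lines may vary:

mpirun -hostfile myhosts -np 8 -npernode 4 hostname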

Additional Resources

Here are a few MPI-related links that may be helpful:
Dartmouth Research Computing Course - Introduction to Parallel Programming Using MPI (Message Passing Interface):
http://tech.dartmouth.edu/its/services-support/help-yourself/knowledge-base/research-computing-course-descriptions
Slides: http://www.dartmouth.edu/~rc/classes/intro_mpi/
Examples from Using MPI, 3rd edition: https://www.mcs.anl.gov/research/projects/mpi/usingmpi/examples-usingmpi/
NCSA Cyberinfrastructure Tutor: http://ci-tutor.ncsa.uiuc.edu/