We show how to use mpi4py for parallel programming through a few simple examples, so that readers can quickly get started with mpi4py. These examples are adapted from the mpi4py documentation, with minor changes.
Point-to-point communication
Pass generic Python objects (blocking)
This approach is simple and easy to use, and it works with any Python object that can be serialized by pickle. However, the pickle and unpickle operations on the sending and receiving sides are not efficient, especially when passing large amounts of data. In addition, blocking communication halts the execution of the process until message delivery completes.
# p2p_blocking.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = {'a': 7, 'b': 3.14}
    print('process %d sends %s' % (rank, data))
    comm.send(data, dest=1, tag=11)
elif rank == 1:
    data = comm.recv(source=0, tag=11)
    print('process %d receives %s' % (rank, data))
The results are as follows:
$ mpiexec -n 2 python p2p_blocking.py
process 0 sends {'a': 7, 'b': 3.14}
process 1 receives {'a': 7, 'b': 3.14}
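Because send() and recv() block, the ordering of the calls matters when two processes exchange messages with each other. The following sketch is our own addition, not part of the original examples: it uses the combined sendrecv() method, which pairs the send and the receive in a single call so neither process stalls waiting for the other.
# p2p_exchange.py (illustrative sketch)
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank in (0, 1):
    other = 1 - rank  # the partner process
    # sendrecv() performs the send and the receive together,
    # avoiding the deadlock risk of two facing blocking send() calls
    received = comm.sendrecv({'from': rank}, dest=other, source=other)
    print('process %d receives %s' % (rank, received))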
Pass generic Python objects (non-blocking)
This approach is likewise simple and easy to use, and it works with any Python object that can be serialized by pickle, but the pickle and unpickle operations on the sending and receiving sides are not efficient, especially when passing large amounts of data. Unlike the blocking variant, non-blocking communication can overlap communication with computation, which can greatly improve performance.
# p2p_non_blocking.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = {'a': 7, 'b': 3.14}
    print('process %d sends %s' % (rank, data))
    req = comm.isend(data, dest=1, tag=11)
    req.wait()
elif rank == 1:
    req = comm.irecv(source=0, tag=11)
    data = req.wait()
    print('process %d receives %s' % (rank, data))
The results are as follows:
$ mpiexec -n 2 python p2p_non_blocking.py
process 0 sends {'a': 7, 'b': 3.14}
process 1 receives {'a': 7, 'b': 3.14}
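Note that the example above calls wait() immediately after starting the communication, so it does not actually overlap anything. A minimal sketch of the intended pattern, our own addition: start the communication, do independent work, and only wait when the result is needed.
# p2p_overlap.py (illustrative sketch)
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    req = comm.isend(list(range(1000)), dest=1, tag=11)
    # computation that does not depend on the message being delivered
    local = sum(i * i for i in range(10000))
    req.wait()  # complete the send only after the local work
elif rank == 1:
    req = comm.irecv(source=0, tag=11)
    # computation that does not depend on the incoming data
    local = sum(i * i for i in range(10000))
    data = req.wait()  # block only now, when the data is needed
    print('process %d receives %d items' % (rank, len(data)))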
Pass numpy arrays (the efficient, fast way)
For array-like data, or more precisely, Python objects that expose a single-segment buffer interface, such as numpy arrays and the built-in bytes/string/array types, a more efficient mode is available: the buffer is passed directly, without pickle serialization and recovery. Passing data this way requires the communicator methods that start with a capital letter, such as Send(), Recv(), Bcast(), Scatter(), Gather(), etc.
# p2p_numpy_array.py
import numpy
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
# passing MPI datatypes explicitly
if rank == 0:
    data = numpy.arange(10, dtype='i')
    print('process %d sends %s' % (rank, data))
    comm.Send([data, MPI.INT], dest=1, tag=77)
elif rank == 1:
    data = numpy.empty(10, dtype='i')
    comm.Recv([data, MPI.INT], source=0, tag=77)
    print('process %d receives %s' % (rank, data))

# automatic MPI datatype discovery
if rank == 0:
    data = numpy.arange(10, dtype=numpy.float64)
    print('process %d sends %s' % (rank, data))
    comm.Send(data, dest=1, tag=13)
elif rank == 1:
    data = numpy.empty(10, dtype=numpy.float64)
    comm.Recv(data, source=0, tag=13)
    print('process %d receives %s' % (rank, data))
The results are as follows:
$ mpiexec -n 2 python p2p_numpy_array.py
process 0 sends [0 1 2 3 4 5 6 7 8 9]
process 1 receives [0 1 2 3 4 5 6 7 8 9]
process 0 sends [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
process 1 receives [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
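The buffer-based interface also has non-blocking counterparts, Isend() and Irecv(), whose requests are completed with the capitalized Wait(). A minimal sketch, our own addition following the same pattern as above:
# p2p_numpy_isend.py (illustrative sketch)
import numpy
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = numpy.arange(10, dtype='i')
    req = comm.Isend([data, MPI.INT], dest=1, tag=99)
    req.Wait()  # the send buffer must not be reused before this returns
elif rank == 1:
    data = numpy.empty(10, dtype='i')
    req = comm.Irecv([data, MPI.INT], source=0, tag=99)
    req.Wait()  # data is valid only after the request completes
    print('process %d receives %s' % (rank, data))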
Collective communication
Broadcast
The broadcast operation copies the data of the root process to all other processes in the same group.
Broadcast generic Python objects
# bcast.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = {'key1' : [7, 2.72, 2+3j],
            'key2' : ('abc', 'xyz')}
    print('before broadcasting: process %d has %s' % (rank, data))
else:
    data = None
    print('before broadcasting: process %d has %s' % (rank, data))
data = comm.bcast(data, root=0)
print('after broadcasting: process %d has %s' % (rank, data))
The results are as follows:
$ mpiexec -n 2 python bcast.py
before broadcasting: process 0 has {'key2': ('abc', 'xyz'), 'key1': [7, 2.72, (2+3j)]}
after broadcasting: process 0 has {'key2': ('abc', 'xyz'), 'key1': [7, 2.72, (2+3j)]}
before broadcasting: process 1 has None
after broadcasting: process 1 has {'key2': ('abc', 'xyz'), 'key1': [7, 2.72, (2+3j)]}
Broadcast numpy array
# Bcast.py
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
if rank == 0:
    data = np.arange(10, dtype='i')
    print('before broadcasting: process %d has %s' % (rank, data))
else:
    data = np.zeros(10, dtype='i')
    print('before broadcasting: process %d has %s' % (rank, data))
comm.Bcast(data, root=0)
print('after broadcasting: process %d has %s' % (rank, data))
The results are as follows:
$ mpiexec -n 2 python Bcast.py
before broadcasting: process 0 has [0 1 2 3 4 5 6 7 8 9]
after broadcasting: process 0 has [0 1 2 3 4 5 6 7 8 9]
before broadcasting: process 1 has [0 0 0 0 0 0 0 0 0 0]
after broadcasting: process 1 has [0 1 2 3 4 5 6 7 8 9]
Scatter
The scatter operation distributes distinct messages from the root process to each of the other processes in the group.
Scatter generic Python objects
# scatter.py
from mpi4py import MPI
comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()
if rank == 0:
    data = [(i + 1)**2 for i in range(size)]
    print('before scattering: process %d has %s' % (rank, data))
else:
    data = None
    print('before scattering: process %d has %s' % (rank, data))
data = comm.scatter(data, root=0)
print('after scattering: process %d has %s' % (rank, data))
The results are as follows:
$ mpiexec -n 3 python scatter.py
before scattering: process 0 has [1, 4, 9]
after scattering: process 0 has 1
before scattering: process 1 has None
after scattering: process 1 has 4
before scattering: process 2 has None
after scattering: process 2 has 9
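Scatter also has a buffer-based counterpart for numpy arrays, mirroring the bcast/Bcast pairing above. A minimal sketch, our own addition: the root provides a send buffer whose first dimension equals the number of processes, and every process supplies a receive buffer for its slice.
# Scatter.py (illustrative sketch)
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

sendbuf = None
if rank == 0:
    # one row of 3 elements per process
    sendbuf = np.arange(size * 3, dtype='i').reshape(size, 3)
recvbuf = np.empty(3, dtype='i')
comm.Scatter(sendbuf, recvbuf, root=0)
print('after scattering: process %d has %s' % (rank, recvbuf))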