Berk Geveci     About     Archive     Feed

mpi4py and VTK

We recently added mpi4py as one of the third party libraries in VTK. Below is a quote from the mpi4py explaining what it is.

MPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.

This package is constructed on top of the MPI-1/MPI-2 specification and provides an object oriented interface which closely follows MPI-2 C++ bindings. It supports point-to-point (sends, receives) and collective (broadcasts, scatters, gathers) communications of any picklable Python object as well as optimized communications of Python object exposing the single-segment buffer interface (NumPy arrays, builtin bytes/string/array objects).

See the mpi4py page for details.

We have been using mpi4py in ParaView for several years and with the recent introduction of the numpy_interface module to VTK, we decided to move the mpi4py dependency to VTK as well. This allowed us to support data parallelism with MPI in the numpy_interface module. I will discuss this in an upcoming blog in more detail.

Using mpi4py is pretty straightforward. The following can be called from vtkpython, pvtkpython and python.

from mpi4py import MPI

Note that if you are going to mix parallel VTK and mpi4py, we recommend using pvtkpython, which initializes several VTK data structures that make it easier for algorithms to access MPI communicators.

VTK also provides a Python-accessible interface to MPI in the vtkMPIController and vtkMPICommunicator classes. However, these classes were not designed to be used from Python and as such provide only a small set of methods. Most often, you will use mpi4py when coding in Python.

In some cases, specially when using MPI groups, it is necessary to pass the communicator used by VTK to mpi4py or vice versa. We developed a simple utility class to enable this. This class is called vtkMPI4PyCommunicator and is used as follows.

import vtk
from mpi4py import MPI

# GlobalController is defined automatically when running pvtkpython
# Otherwise, you need to manually create a vtkMPIController and set
# it yourself.
contr = vtk.vtkMultiProcessController.GetGlobalController()
comm = vtkMPI4PyCommunicator.ConvertToPython(controller.GetCommunicator())

acomm = vtkMPI4PyCommunicator.ConvertToVTK(comm)
acontr = vtk.vtkMPIController()
acontr.SetCommunicator(acomm)

Since mpi4py works very nicely with numpy arrays and VTKArray is a subclass of numpy.ndarray (see previous posts [1], [2] and [3]), it is very straightforward to use them together as follows.

import vtk
from vtk.numpy_interface import dataset_adapter as dsa
from mpi4py import MPI
import numpy

gc = vtk.vtkMultiProcessController.GetGlobalController()

rank = gc.GetLocalProcessId()

fa = vtk.vtkFloatArray()
fa.SetNumberOfTuples(10)
fa.FillComponent(0, rank)
if rank == 0:
    fa.SetValue(3, 10)

vtk_array = dsa.vtkDataArrayToVTKArray(fa)
result = numpy.array(vtk_array)

comm = vtk.vtkMPI4PyCommunicator.ConvertToPython(gc.GetCommunicator())
comm.Allreduce([vtk_array, MPI.FLOAT], [result, MPI.FLOAT], MPI.MAX)

if rank == 0:
    print result

When this is executed as

mpiexec -n 2 pvtkpython parallel_array.py

it prints

[  1.   1.   1.  10.   1.   1.   1.   1.   1.   1.]

In my next blog, I will talk about how to use the algorithms module in parallel. Until then, happy Message Passing.

Many thanks to Ben Boeckel for moving mpi4py to VTK and implementing vtkMPI4PyCommunicator.

Note: This article was originally published on the Kitware blog. Please see the Kitware web site, the VTK web site and the ParaView web site for more information.