Parallel MUMPS solver with MPI
MUMPS supports parallelism using MPI, but Kwant currently only uses it in sequential mode.
MUMPS documentation
There are several issues to consider:

- parallelizing the "solve" step

  Reasonably standard; controlling this will require looking at section 5.1.3 of the documentation.

- constructing the Hamiltonian

  MUMPS supports several ways of specifying the matrix to solve for (see sections 5.2 and 5.2.2 of the documentation):

  - the full matrix is provided on rank 0 (MUMPS will then split the matrix across the available cores)
  - the matrix is pre-split by the application that calls MUMPS
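As a concrete illustration of the first (centralized) input mode: MUMPS takes an assembled matrix as three arrays IRN/JCN/A with 1-based (Fortran-style) indices, provided on rank 0 only. A minimal sketch of the conversion from a scipy sparse matrix (the helper name `to_mumps_triplets` is made up for this example):

```python
import numpy as np
import scipy.sparse as sp

def to_mumps_triplets(mat):
    """Convert a scipy sparse matrix to the 1-based COO triplet
    arrays (IRN, JCN, A) used by MUMPS centralized assembled input."""
    coo = sp.coo_matrix(mat)
    # MUMPS uses Fortran-style 1-based indexing.
    return coo.row + 1, coo.col + 1, coo.data

# Example: a small 3x3 sparse matrix.
m = sp.csr_matrix(np.array([[4.0, 0.0, 1.0],
                            [0.0, 3.0, 0.0],
                            [1.0, 0.0, 2.0]]))
irn, jcn, a = to_mumps_triplets(m)
```

In the parallel setting only rank 0 would build these arrays; the other ranks would participate in the factorization without holding the input matrix.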
The first method is much easier to implement than the second, as the second would require hooking the Builder into MPI (a mess). One could imagine the following way of operating:
```python
comm = ...  # MPI communicator to parallelize over
syst = make_system().finalized()
parallel_mumps_solver.solve(syst, comm=comm)  # not the actual API, just an example
```
The problem with this is that the system is constructed on all ranks, even though MUMPS would only use it on rank 0! One can get around this by requiring the user to construct the system only on rank 0:
```python
comm = ...  # MPI communicator to parallelize over
syst = None
if comm.rank == 0:
    syst = make_system().finalized()
# note that *everyone* must call `solve` in order to avoid a deadlock
parallel_mumps_solver.solve(syst, comm=comm)
```
This way the system is only constructed on rank 0 (saving memory); however, the user must make sure that they use MPI correctly (e.g. call `solve` on all ranks to avoid a deadlock).
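The rank-0 construction pattern above can be sketched as a collective wrapper. This is not the proposed API, just an illustration: `parallel_solve` is a made-up name, and scipy's sparse LU stands in for the MUMPS backend; with a real communicator, every rank must call the function, and the solution is broadcast from rank 0.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla

def parallel_solve(mat, rhs, comm=None):
    """Collective solve sketch.  All ranks call this; only rank 0
    needs to pass the matrix (the others may pass None).  With
    comm=None it degenerates to an ordinary sequential solve."""
    rank = comm.rank if comm is not None else 0
    if rank == 0:
        # Rank 0 holds the matrix and performs the factorization
        # (scipy's splu here, MUMPS in the real implementation).
        x = sla.splu(sp.csc_matrix(mat)).solve(rhs)
    else:
        x = None
    if comm is not None:
        # Distribute the solution so every rank returns it.
        x = comm.bcast(x, root=0)
    return x

A = sp.csc_matrix(np.array([[2.0, 0.0],
                            [0.0, 4.0]]))
x = parallel_solve(A, np.array([2.0, 8.0]))  # sequential fallback
```

The deadlock caveat is visible in the structure: `comm.bcast` is a collective operation, so a rank that skips the call leaves the others blocked.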