April | 2009 | Torsten Hoefler's blog

We have now been convening for more than a year and just finished our 9th meeting! On the way, we released the rather unspectacular MPI-2.1 at EuroPVM 2008 in Dublin (but hey, everything is in a single document now!), which didn’t really change anything.

Then we decided to go for MPI-2.2, which might change something but doesn’t break anything! We’re still unsure whether we will allow ABI changes, though. But MPI-2.2 will certainly be source-compatible (so a recompile might be required – which doesn’t seem that bad to me). The MPI-2.2 process is supposed to guarantee quality. We use the trac system here at IU to manage the changes. Each “ticket” represents a change, which has to be reviewed unofficially by at least four members of the Forum. Then it can be read in front of the whole Forum at any meeting. After that, we have a first and a second vote, and each successful ticket has to pass both. At the end, we vote on the inclusion of each chapter in MPI-2.2. Each ticket must go through this procedure, and only a single state change is allowed per meeting. This gives the Forum and the public a long time (>8 months) to review the proposals carefully. We also require an (open-source) implementation of each proposed change.

We have been discussing MPI-2.2 for several meetings – but the last (April ’09) meeting was an important milestone! Since we plan to release MPI-2.2 at this year’s EuroPVM, we had to close the door. Effectively, this means that all tickets that were not read at this meeting are postponed to MPI-3. I think we did pretty well and we’re on schedule.

Some tickets that I think are interesting are:

Add a local Reduction Function – this enables the user to apply MPI reduction operations locally (without communication). This is very useful for library implementors (e.g., for implementing new collective routines on top of MPI).

Regular (non-vector) version of MPI_Reduce_scatter – this addresses a kind of missing functionality. The current Reduce_scatter should really be called Reduce_scatterv … but it isn’t. Anyway, if you ever asked yourself “why the heck should I use Reduce_scatter?”, then think about parallel matrix multiplication! An example is attached to the ticket.
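To make the semantics of the regular (non-vector) variant concrete, here is a small single-process model; it is not the example attached to the ticket and it does not call MPI at all. The function name `reduce_scatter_block_sim` and the array layout are made up for illustration. Think of a matrix-vector product where every rank owns a block of columns: each rank computes a full-length partial result, and the reduce-scatter sums these and leaves every rank with just its own slice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-process model of a "regular" reduce-scatter with
 * summation; no MPI calls are made.
 *
 * contrib[r * p * n + k] : element k (0 <= k < p*n) contributed by rank r
 * result[i * n + e]      : element e of the slice that rank i receives
 *
 * Every rank receives the same number of elements (n), which is what
 * distinguishes the regular variant from the vector variant. */
void reduce_scatter_block_sim(const double *contrib, double *result,
                              size_t p, size_t n)
{
    assert(contrib != NULL && result != NULL);
    for (size_t i = 0; i < p; i++) {            /* receiving rank      */
        for (size_t e = 0; e < n; e++) {        /* element in slice    */
            double sum = 0.0;
            for (size_t r = 0; r < p; r++)      /* contributing rank   */
                sum += contrib[r * p * n + i * n + e];
            result[i * n + e] = sum;
        }
    }
}
```

With two ranks contributing {1, 2} and {10, 20} and n = 1, rank 0 ends up with 11 and rank 1 with 22.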

Add MPI_IN_PLACE option to Alltoall – nobody knows why this is not in MPI-2. I suppose that it seemed complicated to implement (an optimized implementation is indeed NP-hard), but we have a simple (non-optimal, linear-time) algorithm to do it. It’s attached to the ticket :).
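The code attached to the ticket is not reproduced here, but the idea behind a simple in-place scheme can be sketched as a pairwise exchange. The following is a single-process model under my own assumptions: all "ranks" live in one array, each block is a single int, and the function name `alltoall_in_place_sim` is hypothetical. In a real MPI program, each rank would perform one exchange per peer (for instance with `MPI_Sendrecv_replace`), which is where the linear number of steps per rank comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-process model of an in-place all-to-all.
 * buf[r * p + j] is the block that "rank" r holds for rank j
 * (one int per block to keep the sketch short).
 *
 * Pairwise exchange: ranks r and j swap the blocks they hold for each
 * other. Only one temporary block is needed per swap, never a second
 * buffer of p blocks. */
void alltoall_in_place_sim(int *buf, size_t p)
{
    assert(buf != NULL);
    for (size_t r = 0; r < p; r++) {
        for (size_t j = r + 1; j < p; j++) {
            int tmp = buf[r * p + j];
            buf[r * p + j] = buf[j * p + r];
            buf[j * p + r] = tmp;
        }
    }
}
```

After the call, block j of rank r holds what rank j originally had stored for rank r; the block a rank keeps for itself is untouched.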

Fix Scalability Issues in Graph Topology Interface – this is in my opinion the most interesting/important addition in MPI-2.2. The graph topology interface in MPI-2.1 is horribly broken in that every process needs to provide the *full* graph to the library (which even for sparse graphs leads to $\Omega(P)$ memory *per node*). I think we have an elegant fix that enables a fully distributed specification of the graph, in which each node specifies only its own neighbors. This will be even more interesting in MPI-3, when we start to use the topology as communication context.
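To get a feeling for the memory argument, consider a ring of P processes. The sketch below (plain C, no MPI calls, helper names are hypothetical) builds the `index`/`edges` arrays in the layout that `MPI_Graph_create` expects, which every process has to hold, and compares that with a process that only names its own two neighbors. It assumes P >= 3 so that the left and right neighbors are distinct.

```c
#include <assert.h>
#include <stdlib.h>

/* Full-graph description of a ring of p processes (assumes p >= 3), in
 * the index/edges layout of MPI_Graph_create: index[r] is the cumulative
 * neighbor count of nodes 0..r, edges is the flattened neighbor list.
 * Every process must build all of it. The caller frees *index and *edges.
 * Returns the number of ints stored per process, or -1 on failure. */
int ring_full_graph(int p, int **index, int **edges)
{
    assert(p >= 3);
    *index = malloc((size_t)p * sizeof **index);
    *edges = malloc((size_t)p * 2 * sizeof **edges);
    if (*index == NULL || *edges == NULL) {
        free(*index);
        free(*edges);
        return -1;
    }
    for (int r = 0; r < p; r++) {
        (*index)[r] = 2 * (r + 1);               /* cumulative degree */
        (*edges)[2 * r] = (r + p - 1) % p;       /* left neighbor     */
        (*edges)[2 * r + 1] = (r + 1) % p;       /* right neighbor    */
    }
    return p + 2 * p;                            /* grows with p      */
}

/* Distributed description: a process names only its own neighbors.
 * Returns the number of ints stored per process. */
int ring_local_neighbors(int rank, int p, int nbrs[2])
{
    assert(p >= 3 && rank >= 0 && rank < p);
    nbrs[0] = (rank + p - 1) % p;
    nbrs[1] = (rank + 1) % p;
    return 2;                                    /* independent of p  */
}
```

For the ring, the full description costs 3P ints on every process, while the local description stays at 2 ints no matter how large P gets.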

Extending MPI_COMM_CREATE to create several disjoint sub-communicators from an intracommunicator – neat feature that allows you to create multiple communicators with a single call!

Add MPI_IN_PLACE option to Exscan – again, I don’t know why this is missing. The rationale that is given is not convincing.

Define a new MPI_Count Datatype – MPI-2.1 can’t send more than 2^31 − 1 (≈2 billion) objects in a single call right now, because the count argument is a 32-bit int – we should fix that!

Add const Keyword to the C bindings – most discussed feature I guess 🙂 – I am not sure about the consequences yet, but it seems nice to me (so far).
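Coming back to the MPI_Count item above: the limit is simply what fits into the `int` count argument. The sketch below assumes a 32-bit `int` (hence `INT32_MAX`) and uses made-up helper names; it shows when a transfer no longer fits into one call and how many calls a naive chunking workaround would need.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the count argument is a 32-bit signed int, so the largest
 * count per call is 2^31 - 1 = 2147483647. */
#define MAX_INT_COUNT INT32_MAX

/* Returns 1 if n elements can be passed as a single int count. */
int fits_in_int_count(int64_t n)
{
    assert(n >= 0);
    return n <= MAX_INT_COUNT;
}

/* Number of calls a naive workaround needs if it splits n elements into
 * chunks of at most MAX_INT_COUNT elements (rounding up). */
int64_t chunks_needed(int64_t n)
{
    assert(n >= 0);
    return (n + MAX_INT_COUNT - 1) / MAX_INT_COUNT;
}
```

Three billion elements, for example, do not fit into one call under this assumption and would need two chunks. A wider count type would make this kind of workaround unnecessary.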

Allow concurrent access to send buffer – most programmers probably did not know that this is illegal, but it certainly is. For example:

int sendbuf = 42;
MPI_Request req[2];

/* two nonblocking sends read the same buffer at the same time */
MPI_Isend(&sendbuf, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req[0]);
MPI_Isend(&sendbuf, 1, MPI_INT, 2, 1, MPI_COMM_WORLD, &req[1]);
MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

is not valid! Two threads are also not allowed to concurrently send from the same buffer. This proposal will allow such access.

MPI_Request_free bad advice to users – I personally think that MPI_Request_free is dangerous (especially in the context of threads) and does not provide much to the user. But we can’t get rid of it … so let’s discourage users from using it!

Deprecate the C++ bindings – that’s funny, isn’t it? But look at the current C++ bindings: they’re nothing more than pimped C bindings and only create problems. Real C++ programmers would use Boost.MPI (which internally uses the C bindings ;)), right?

We also made some progress regarding MPI-3, where we can add more complex features that might (!) change the interface (but not break backwards compatibility). So we voted on Nonblocking Collective Operations (#109, my hobbyhorse) – and it passed unanimously!

For all votes, see votes.
