Indiana University’s new Datacenter (the bunker)

I attended the inauguration (dedication ceremony) of the new $32 million datacenter. I have to say this building is impressive for a data center and reminds me of my time in the German army. I think everybody would agree to call it “the bunker”. It is designed to withstand an F5 tornado, and with its three-foot (one-meter) thick concrete walls it might also survive plane crashes or bomb raids. The new datacenter is really amazing and solves all the problems of Wrubel (some of us might remember some power problems ;-)).

There is a nice video on YouTube: http://www.youtube.com/watch?v=zdHvnt3D7Tc

And this nice picture: 7984_h

UITS moved all servers and services in a single day! Well done guys!

ICPP 2009 in Vienna

I presented our initial work on Offloading Collective Operations, which is the definition of an assembly language for group operations (GOAL), at ICPP’09 in Vienna. I was rather disappointed by this year’s ICPP. There were already problems with the program selection before the conference (I’ll happily tell you details on request), and the final program was not great. Some talks were very entertaining though. I really enjoyed the P2S2 workshop, especially Pete Beckman’s keynote. Other highlights (in my opinion) include:

  • Mondrian’s “A resource optimized remote-memory-access architecture for low-latency communication” (I needed to talk to those guys, and I did ;))
  • Argonne’s “Improving Resource Availability By Relaxing Network Allocation Constraints on the Blue Gene/P” (I need to read the paper because I missed the talk due to chaotic re-scheduling, but Narayan’s 5-minute elevator pitch summary seemed very interesting)
  • Prof. Resch’s keynote on “Simulation Performance through Parallelism – Challenges and Options” (he even mentioned the German Pirate party, which I really enjoyed!)
  • Brice’s work with Argonne on “Cache-Efficient, Intranode Large-Message MPI Communication with MPICH2-Nemesis”
  • Argonne’s “End-to-End Study of Parallel Volume Rendering on the IBM Blue Gene/P” (yes, another excellent Argonne talk right before my presentation :))

Here are some nice pictures:
vienna1
My talk on the last day was a real success (very well attended, even though it was the last talk of the conference)! It’s good to have friends (and a good talk from Argonne right before mine :-)). Btw., two of the three talks in the (only) “Information Retrieval” session were completely misplaced and had nothing to do with information retrieval, weird …

vienna2
My co-author, friendly driver, and camera-man and me in front of the parliament.

EuroPVM/MPI 2009 report

This year’s EuroPVM/MPI was held in Helsinki (not quite, but close to it). I stayed on Hanasaari, a beautiful island with a small hotel and conference center on it. It’s a bit remote but nicely surrounded by nature.

The conference was nice; I learned about formal verification of MPI programs in the first day’s tutorial. This technique seems really nice for non-deterministic MPI programs (how many are there?), but there are certainly some open problems (similar to the state explosion of thread-checkers). The remainder of the conference was very nice and it felt good to meet the usual MPI suspects again. Some highlights, in my opinion, were:

  • Edgar’s “VolpexMPI: an MPI Library for Execution of Parallel Applications on Volatile Nodes” (indeterminism is an interesting discussion in this context)
  • Rusty’s keynote on “Using MPI to Implement Scalable Libraries” (which I suspect could use collectives)
  • Argonne’s “Processing MPI Datatypes Outside MPI” (could be very very useful for LibNBC)
  • and Steven’s invited talk on “Formal Verification for Scientific Computing: Trends and Progress” (an excellent overview for our community)

The whole crowd:
image_preview

Unfortunately, I had to leave before the MPI Forum information session to catch my flight.
Videos of many talks are available online. All in all, it was worth attending. Next year’s EuroMPI (yes, the conference was finally renamed after the second year in a row without a PVM paper) will be in Stuttgart. So stay tuned and submit papers!

MPI 2.2 is now officially ratified by the MPI Forum!

I just came back from lunch after the MPI Forum meeting in Helsinki. This meeting focused (for the last time) on MPI 2.2. We finished the review of the final document and edited several minor things. Bill did a great job chairing and pushing the MPI 2.2 work and the overall editing. Unfortunately, we did not meet our own deadlines, i.e., the chapters and reviews were not finished two weeks ago (I tried to push my chapters (5 and 7) as hard as possible, but getting the necessary reviews was certainly not easy). However, the whole document was reviewed (read) by Forum members during the meeting and my confidence is high that everybody did a good job.

Here are the results of the official vote on the main document:
yes: 14
no: 1
abstain: 2 (did not participate)

The votes by chapter will be online soon.

The feature set of the standard did not change. I posted it earlier here, and so did Jeff. But it’s official now! Implementors should now get everything implemented so that all users can enjoy the new features.

Here is a local copy (mirror) of the official document: mpi-report-2_2.pdf (the creation date might change)

One downside is that we already have errata items for things that were discovered too late in the process. This seems odd, but we decided that we should not break our own rules. Even if the standard says that an MPI_C_BOOL is 4 bytes, we had to close the door for changes at some point. The erratum (MPI_C_BOOL is one byte) will be voted on and posted soon on the main webpage.

Rolf will publish the MPI 2.2 book (like he did for MPI 2.1) and it will be available at Supercomputing 2009. I already ordered my copy :).

And now we’re moving on to MPI 3, so stay tuned (or participate in the forum)!

mpi-report-2.2-2009-09-04-as-1book

Hot Interconnects (HOTI’09) in New York

I attended the Hot Interconnects conference for the second time and it was as great as last year! This conference is rather compelling because it is a single-track conference with only a small number of highly interesting papers. And still, the attendance is huge, unlike at some other conferences where people only come when they have to present a paper and the audience is often sparse.

I gave a talk on static oblivious InfiniBand routing, which was well received (I got a lot of questions and had very interesting discussions, especially during the breaks). Other highlights of the conference (in my opinion) were:

  • Fulcrum’s impressive 10 Gb/s FocalPoint switch design (the switch has full bandwidth at 10 Gb/s line rate, real 10 Gb/s, not 8 ;))
  • A paper about the implementation of collective communication on BG/P (I was hoping for a bit more theoretical background and a precise model of the BG/P network)
  • Some talks on optical networking were rather interesting
  • The panel about remote memory access over converged Ethernet was rather funny. Some people are seriously trying to implement the <irony> simple and intuitive </irony> OFED interface on top of Ethernet. I am wondering which real application (other than MPI) uses OFED as a communication API?

Here are some commented pictures:
ny1
View from the Empire State Building (credits go to Patrick!).

ny2
Another view from Empire State, the red arrow points at the conference location (the Credit Suisse Palace).

ny2
Times square (I think).

ny2
The Empire State “tip”.

ny2
Another downtown view.

ny6
The Empire State foyer.

ny2
I feel like in Paris ;).

ny2
We saw this scary building without windows as we walked from the Empire State Building down to the World Trade Center site (weird).
ny9
*yeah* (seen next to the WTC site).

ny2
All we could see from the WTC site. It was not worth the long walk … but we talked anyway most of the time so this paid off.

ny2
Wall Street (should be closed immediately!).

ny2
The subway is rather scary … seriously, New York!? Why do they have such a bad subway …

ny2
View from my (extremely cheap) hotel room. It was awesome, really!

ny14
The Credit Suisse Palace from the inside. Somebody has too much money (still).

The computer science publication system revisited

Sometimes I get to hear that computer science is not a real science, and sometimes I believe it myself. Science has many definitions and often refers to the scientific method that is used to systematically acquire and disseminate new facts about nature. I think in this sense, computer science has to be split into an applied part (engineering) and a theoretical part (mathematics).

I worked on the engineering side for several years and reached the point where I am disappointed by its state. It seems to be less systematic than it needs to be, often focusing on mere implementation details in system xyz while forgetting about generality and nature. I think the main problems are missing reproducibility, the importance of conferences, the current review system, and the self-imposed pressure on the number of publications. This often leads to rushed publications only containing the “least publishable increment” or even wrong results. I also see many conference presentations that don’t present anything new but merely implement something that clearly works in theory. Rather than “uh, that’s a nice idea”, I feel more like “oh yes, they’ve implemented xyz with technology zyx”. I am also following the mathematical side of computer science and I have to say that it has different issues.

It is good that the general state of the publication system in computer science is being discussed. I agree with most of the points from Lance Fortnow and Moshe Vardi. Both reports are worth a read! The USENIX collected wisdom on the review system is also interesting! The last very relevant link deals with the evaluation of individuals in computer science. I’ll close by citing a note on double-blind reviews.

[Update] An excellent paper: “Stop the Numbers Game” by David Lorge Parnas

The MPI Standard MPI-2.2 is fixed now

We just finished all voting on the last MPI-2.2 tickets! This means that MPI-2.2 is fixed now; no changes are possible. The remaining work is simply to merge the accepted tickets into the final draft that will be voted on next time in Helsinki. I just finished editing my parts of the standard draft. Everything (substantial) that I proposed made it in with a reasonable quorum. The new graph topology interface was nicely accepted this time (I think I explained it better and I presented an optimized implementation). However, other tickets didn’t go that smoothly. The process seems very interesting from a social perspective (the way we vote has a substantial impact on the results etc.).

Some tickets that I think are worth discussing are:

Add a local Reduction Function – this enables the user to use MPI reduction operations locally (without communication). This is very useful for library implementors (e.g., implementing new collective routines on top of MPI) – PASSED!
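
To illustrate the point, here is my own minimal sketch (not text from the ticket): a library that builds its own collective can now fold a received partial result into its local accumulator without any communication.

    /* Minimal sketch: fold a received partial result into the local
     * accumulator; no communication happens here. */
    #include <mpi.h>

    static void fold_partial(int *accum, int *incoming, int count)
    {
        /* MPI_Reduce_local(inbuf, inoutbuf, count, datatype, op):
         * combines 'incoming' into 'accum' using MPI_SUM */
        MPI_Reduce_local(incoming, accum, count, MPI_INT, MPI_SUM);
    }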


Regular (non-vector) version of MPI_Reduce_scatter – this addresses a kind of missing functionality. The current Reduce_scatter should really be called Reduce_scatterv … but it isn’t. Anyway, if you ever asked yourself why the heck you should use Reduce_scatter, think about parallel matrix multiplication! An example is attached to the ticket. – PASSED!
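
If I recall the accepted ticket correctly, the regular variant ended up as MPI_Reduce_scatter_block, where every process receives the same number of reduced elements. A minimal sketch (my example, hypothetical buffer names):

    /* Sketch: every process contributes P blocks of n partial results and
     * receives the n reduced values of its own block (e.g., its tile of C
     * in a parallel matrix multiplication). */
    #include <mpi.h>

    void reduce_scatter_regular(double *partials, double *mine, int n, MPI_Comm comm)
    {
        /* 'partials' holds P blocks of n doubles, 'mine' receives n doubles */
        MPI_Reduce_scatter_block(partials, mine, n, MPI_DOUBLE, MPI_SUM, comm);
    }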

Add MPI_IN_PLACE option to Alltoall – nobody knows why this is not in MPI-2. I suppose that it seemed complicated to implement (an optimized implementation is indeed NP-hard), but we have a simple (non-optimal, linear-time) algorithm to do it. It’s attached to the ticket :). – PASSED!
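
The user-visible side is simple; here is a hedged sketch of the in-place call (sendcount and sendtype are ignored when MPI_IN_PLACE is passed), not the linear-time exchange algorithm from the ticket:

    /* Sketch: in-place all-to-all. 'buf' holds P blocks of 'count' ints;
     * after the call, block i contains the data received from rank i. */
    #include <mpi.h>

    void exchange_in_place(int *buf, int count, MPI_Comm comm)
    {
        MPI_Alltoall(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL,
                     buf, count, MPI_INT, comm);
    }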


Fix Scalability Issues in Graph Topology Interface – this is, in my opinion, the most interesting/important addition in MPI-2.2. The graph topology interface in MPI-2.1 is horribly broken in that every process needs to provide the *full* graph to the library (which even for sparse graphs leads to $\Omega(P)$ memory *per node*). I think we have an elegant fix that enables a fully distributed specification of the graph, where each node specifies only its own neighbors (see the sketch below). This will be even more interesting in MPI-3, when we start to use the topology as communication context. – PASSED!
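
As a sketch of what the distributed specification looks like (assuming the MPI_Dist_graph_create_adjacent call that came out of this proposal; the ring example is mine):

    /* Sketch: each rank passes only its own neighbors, so the memory per
     * process is proportional to its degree, not to P. */
    #include <mpi.h>

    MPI_Comm make_ring_topology(MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        /* a 1-D ring: left and right neighbor, both as sources and destinations */
        int nbrs[2] = { (rank - 1 + size) % size, (rank + 1) % size };

        MPI_Comm ring;
        MPI_Dist_graph_create_adjacent(comm, 2, nbrs, MPI_UNWEIGHTED,
                                       2, nbrs, MPI_UNWEIGHTED,
                                       MPI_INFO_NULL, 0 /* no reorder */, &ring);
        return ring;
    }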

Extending MPI_COMM_CREATE to create several disjoint sub-communicators from an intracommunicator – a neat feature that allows you to create multiple communicators with a single call! – PASSED!
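
A sketch of how I understand the extended semantics (the parity split is my own example): each process passes the group it wants to belong to, the groups are disjoint, and every process gets back the communicator for its own group from a single call.

    #include <mpi.h>
    #include <stdlib.h>

    /* Split 'comm' into an even-rank and an odd-rank communicator with a
     * single MPI_Comm_create call (each process passes its own group). */
    MPI_Comm split_even_odd(MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);

        MPI_Group world, mine;
        MPI_Comm_group(comm, &world);

        /* collect all ranks with the same parity as mine */
        int n = 0;
        int *ranks = (int *)malloc(size * sizeof(int));
        for (int r = rank % 2; r < size; r += 2)
            ranks[n++] = r;
        MPI_Group_incl(world, n, ranks, &mine);

        MPI_Comm sub;                 /* my parity's communicator */
        MPI_Comm_create(comm, mine, &sub);

        MPI_Group_free(&mine);
        MPI_Group_free(&world);
        free(ranks);
        return sub;
    }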

Add MPI_IN_PLACE option to Exscan – again, I don’t know why this is missing in MPI-2.0. The rationale that is given is not convincing. – PASSED!
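
For completeness, a one-line sketch of the in-place use (my example):

    /* Sketch: exclusive prefix sum computed in place; the value on rank 0
     * is not defined by the operation (there is no exclusive prefix there). */
    #include <mpi.h>

    void exclusive_prefix_sum(long *val, MPI_Comm comm)
    {
        MPI_Exscan(MPI_IN_PLACE, val, 1, MPI_LONG, MPI_SUM, comm);
    }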

Define a new MPI_Count Datatype – MPI-2.1 can’t send more than 2^31 (about 2 billion) objects in a single operation right now (counts are plain C ints) – we should fix that! However, we had to move this to MPI-3 due to several issues that came up during the implementation (most likely ABI issues). – POSTPONED! It feels really good to have this strict implementation requirement! We will certainly have this important fix in MPI-3!

Add const Keyword to the C bindings – most discussed feature I guess 🙂 – I am not sure about the consequences yet, but it seems nice to me (so far). – POSTPONED! We moved this to MPI-3 because some part of the Forum wasn’t sure about the consequences. I am personally also going back and forth; the issue with strided datatypes seems really worrisome.

Allow concurrent access to send buffer – most programmers probably did not know that this is illegal, but it certainly is in MPI<=2.0. For example: int sendbuf; MPI_Request req[2]; MPI_Isend(&sendbuf, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req[0]); MPI_Isend(&sendbuf, 1, MPI_INT, 2, 1, MPI_COMM_WORLD, &req[1]); MPI_Waitall(2, req, MPI_STATUSES_IGNORE); is not valid! Two threads are also not allowed to concurrently send from the same buffer. This proposal allows such access (see the compile-ready sketch below). – PASSED!
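
Here is the example from above in compile-ready form (my own sketch; run with at least three processes). Under MPI-2.2 the two outstanding sends from the same buffer are legal as long as the buffer is only read:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* the same (read-only) buffer feeds two outstanding sends */
            int sendbuf = 42;
            MPI_Request req[2];
            MPI_Isend(&sendbuf, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &req[0]);
            MPI_Isend(&sendbuf, 1, MPI_INT, 2, 1, MPI_COMM_WORLD, &req[1]);
            MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
        } else if (rank == 1 || rank == 2) {
            int v;
            MPI_Recv(&v, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }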

MPI_Request_free bad advice to users – I personally think that MPI_Request_free is dangerous (especially in the context of threads) and does not provide much to the user. But we can’t get rid of it … so let’s discourage users from using it! – PASSED!
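
To show why I find it dangerous, a minimal sketch of the pattern the new advice discourages (my example):

    #include <mpi.h>

    /* After MPI_Request_free there is no MPI call left that tells the
     * caller when 'buf' may safely be reused; the program must rely on
     * out-of-band knowledge (e.g., a later reply from the receiver). */
    void fire_and_forget(int *buf, int dest, MPI_Comm comm)
    {
        MPI_Request req;
        MPI_Isend(buf, 1, MPI_INT, dest, 0, comm, &req);
        MPI_Request_free(&req);
    }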

Deprecate the C++ bindings – that’s funny, isn’t it? But look at the current C++ bindings; they’re nothing more than pimped C bindings and only create problems. Real C++ programmers would use Boost.MPI (which internally uses the C bindings ;)), right? – PASSED (even though I voted against it ;))

Something odd happened to New Predefined Datatypes. We found a small typo in the ticket (MPI_C_BOOL should be 1 instead of 4 bytes). However, it wasn’t small enough that we could just change it (the process doesn’t allow significant changes after the first vote). It was voted in with this bug (I abstained after the heavy discussion though) and it’s also too late to file a new ticket to fix it. However, we will have an errata item that clarifies this. It might sound strange, but I’m very happy that we stick to our principles and don’t change anything without proper reviews (these reviews between the meetings, where vendors could get user feedback, have influenced tickets quite a lot in the past). But still PASSED!

For all tickets and votes, see MPI Forum votes!

I’m very satisfied with the way the Forum works (Bill Gropp is doing a great job with MPI-2.2). From what I hear about other standardization bodies, I have to say that our rules seem very sophisticated. I think MPI-2.2 will be a nice new standard which is not only a bugfix but also offers new opportunities to library developers and users (see the tickets above). We are also planning to have a book again (perhaps with an editorial comment addressing the issue in ticket 18 (MPI_C_BOOL))!

The day a department died

The Computer Science Department at Indiana University is no more. I was told that it was one of the oldest CS departments in the US. Now, the webpage says “The Computer Science Department is now School of Informatics and Computing”, which is not correct. It’s actually only a division of the school formerly known as the “School of Informatics”.

The department seems to have turned into the “Computer Science Program”. Some people ask me how it differs from the “Informatics Program”; I honestly don’t know.

I’ll keep the old logo for reference
cs_circuit

Here’s the new one
computerscienceprogram

Some mourning folks:
mourning