Yes, it is as cryptic as it sounds :). After a couple of months, I finally finished merging Christian’s MPI_Bcast() patch with my LibNBC patch to enable the use of non-blocking collectives in HPL. The performance still has to be investigated in more detail; LibNBC will probably provide benefits if lookahead is used, but that clearly needs more investigation too. I posted it on the webpage because some people asked me to publish it … so feel free to play around with the patches. I’d also be happy to receive any feedback (yes, even bug reports :)).
The MPI_Bcast patch seems to break several MPI libraries (e.g. Open MPI (not the trunk :)) and MPICH2) because it uses really huge datatypes. It works with MVAPICH and newer Open MPI trunk versions.