Yes, it is as cryptic as it sounds. After a couple of months, I finally finished merging Christian’s MPI_Bcast() patch with my LibNBC patch to enable the use of non-blocking collectives in HPL. The performance still has to be investigated in more detail; LibNBC will probably provide benefits if lookahead is used. I posted it on the webpage because some people asked me to publish it, so feel free to play around with the patches. I’d also be happy to receive any feedback (yes, even bug reports).
The MPI_Bcast patch seems to break several MPI libraries (e.g. Open MPI, except for the trunk, and MPICH2) because it uses really huge datatypes. It works with MVAPICH and newer Open MPI trunk versions.