Publications of Torsten Hoefler
Torsten Hoefler, P. Gottschling, Andrew Lumsdaine and Wolfgang Rehm:

 Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations

(Elsevier Journal of Parallel Computing (PARCO), Vol. 33, No. 9, pages 624-633, Elsevier, ISSN: 0167-8191, Sep. 2007)


This paper presents a case study that analyzes the suitability and usage of non-blocking collective operations in parallel applications. As with their point-to-point counterparts, non-blocking collective operations provide the ability to overlap communication with computation and to avoid unnecessary synchronization. These operations are provided for MPI programs with LibNBC, a portable low-overhead implementation of non-blocking collective operations built on MPI-1. The straightforward applicability of LibNBC is demonstrated by incorporating non-blocking collective operations into a parallel conjugate gradient solver. Although only minor changes are required to use them, non-blocking collective operations allow most of the communication costs to be hidden and provide performance improvements of up to 34%. We also show that, because of overlap, there is no significant performance difference between Gigabit Ethernet and InfiniBand™ for special cases of our calculation.
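
As a rough illustration of the overlap pattern the abstract describes, the following Python sketch starts a reduction, performs independent local work, and waits for the result only when it is needed. It is not the paper's code and does not use MPI or LibNBC: a thread pool stands in for the communication layer, the "ranks" are simulated as list chunks, and the names `iallreduce_sum` and `overlapped_step` are hypothetical. It shows only the structure of the pattern, not its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def iallreduce_sum(pool, partials):
    """Hypothetical stand-in for a non-blocking allreduce.

    Returns a handle (a Future) immediately; the sum runs in the background.
    """
    return pool.submit(sum, partials)


def overlapped_step(pool, local_chunks, x, alpha, p):
    """One simulated solver step with the reduction overlapped by local work."""
    # Each simulated rank computes its partial dot product r.r.
    partials = [sum(r * r for r in chunk) for chunk in local_chunks]

    # Start the "collective" and keep going instead of blocking on it.
    handle = iallreduce_sum(pool, partials)

    # Independent local update that does not need the reduction result.
    x_new = [xi + alpha * pi for xi, pi in zip(x, p)]

    # Wait for the reduction only when its value is actually required.
    rr = handle.result()
    return x_new, rr


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        x_new, rr = overlapped_step(
            pool, [[1.0, 2.0], [3.0]], [0.0, 0.0, 0.0], 0.5, [2.0, 4.0, 6.0]
        )
        print(x_new, rr)
```

In an MPI program the same start-then-wait shape would be expressed with a non-blocking collective and a matching wait call (for example `MPI_Iallreduce` and `MPI_Wait` in MPI-3). The useful property is that the vector update between the two calls does not depend on the reduced value.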


  author={Torsten Hoefler and P. Gottschling and Andrew Lumsdaine and Wolfgang Rehm},
  title={{Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations}},
  journal={Elsevier Journal of Parallel Computing (PARCO)},

© Torsten Hoefler