Publications of Torsten Hoefler
T. Hoefler:

 Improving Parallel Computing Platforms

(Presentation - presented in Munich, Germany, Oct. 2009, Presentation at the Technical University of Munich, Host: Prof. M. Gerndt)


Large-scale parallel systems are important for advancing science in many fields. In this talk, we address issues in the programming and design of such systems. We emphasize the importance of collective operations as high-level specifications of data redistribution and discuss new developments in versions 2.2 and 3 of the Message Passing Interface (MPI) standard. We discuss application studies and use cases for the new nonblocking collective operations, as well as a proposal for nearest-neighbor (sparse) collective operations to support common stencil communication patterns. Later in the talk, we turn to system issues in the design of large-scale systems. We present a case study based on the InfiniBand network architecture and evaluate the effects of static routing strategies. We also dispel several myths about full bisection bandwidth networks. Based on this discussion, we develop a new routing strategy for InfiniBand networks and, if time permits, finish with a short excursion into adaptive routing. With this work, we show that large-scale systems must be analyzed and optimized as a whole: programming strategies and abstractions, network topologies, and routing have to be considered together.




