Advanced MPI at Speedup'15
Advanced Parallel Programming with MPI
Torsten Hoefler, ETH Zurich
The new series of MPI standards (MPI-2.2 and MPI-3.0) responds to several developments in HPC systems and applications that happened during the last decade. The modernized standard adds several key concepts for programming massively parallel modern hardware systems. In this tutorial, we will cover the three major concepts: (1) nonblocking collectives and flexible communicator creation, (2) greatly improved remote memory access (RMA) programming, and (3) topology mapping to improve locality, together with neighborhood ("build your own") collective operations. Nonblocking collectives make it possible to write applications that are resilient to small time variations (noise), overlap communication with computation, and implement new complex communication protocols. The new remote memory access semantics allow applications to efficiently exploit modern computing systems that offer RDMA, but they require a new way of thinking about and developing applications. Topology mapping lets applications specify their communication requirements and enables the MPI implementation to optimize the process-to-node mapping. Last but not least, neighborhood collectives form a powerful mechanism by which programmers can specify their own collective operations and allow the MPI implementation to apply additional optimizations.
Introductory: 25%, Intermediate: 50%, Advanced: 25%
We generally assume basic familiarity with MPI, i.e., attendees should be able to write and execute simple MPI programs. We also assume familiarity with general HPC concepts (e.g., a basic understanding of batch systems, communication/computation tradeoffs, and networks).
Examples: examples.tgz - (16.15 kb)
Slides: hoefler-advanced-mpi-speedup15.pdf - (9727.78 kb)