Publications of Torsten Hoefler
Copyright Notice:
The documents distributed by this server have been provided by the
contributing authors as a means to ensure timely dissemination of
scholarly and technical work on a noncommercial basis. Copyright and all
rights therein are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their works here
electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each
author's copyright. These works may not be reposted without the explicit
permission of the copyright holder.
T. Hoefler, T. Schneider and A. Lumsdaine:
Characterizing the Influence of System Noise on Large-Scale Applications by Simulation
(In International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10), Nov. 2010). SC10 Best Paper Award.
Abstract:
Although system noise is increasingly a concern as
HPC systems continue to grow in scale, existing studies with artificial noise
models provide only limited insight into application behavior. This paper
presents an in-depth analysis of the impact of system noise on large-scale
parallel application performance in realistic settings. Our analytical model
shows the particular circumstances under which noise is propagated or absorbed.
The model shows that not only collective operations but also point-to-point
communications influence the application's sensitivity to noise. We present a
simulation toolchain that injects noise delays from traces gathered on four
common large-scale architectures into a LogGPS simulation and allows new
insights into the scaling of applications in noisy environments. We investigate
collective operations with up to 1 million processes and three applications
(Sweep3D, AMG, and POP) with up to 32,000 processes. We show that the scale at
which noise becomes a bottleneck is system-specific and depends on the
structure of the noise. Simulations with different network speeds show that a
10x faster network does not improve application scalability because noise
becomes a bottleneck at scale. We quantify this noise bottleneck and conclude
that our tools can be utilized to tune the noise signatures of a specific
system for minimal noise propagation. For example, our simulations verify the
long-standing conjecture that co-scheduling prevents significant application
slowdown.
BibTeX:
@inproceedings{hoefler-noise-sim,
  author={T. Hoefler and T. Schneider and A. Lumsdaine},
  title={{Characterizing the Influence of System Noise on Large-Scale Applications by Simulation}},
  year={2010},
  month={Nov.},
  booktitle={International Conference for High Performance Computing, Networking, Storage and Analysis (SC'10)},
  source={http://www.unixer.de/~htor/publications/},
}