Publications of Torsten Hoefler
Copyright Notice:
The documents distributed by this server have been provided by the
contributing authors as a means to ensure timely dissemination of
scholarly and technical work on a noncommercial basis. Copyright and all
rights therein are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their works here
electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each
author's copyright. These works may not be reposted without the explicit
permission of the copyright holder.
T. Schneider, S. Eckelmann, T. Hoefler, and W. Rehm:
Kernel-Based Offload of Collective Operations - Implementation, Evaluation and Lessons Learned
(In Proceedings of the 17th international conference on Parallel processing - Volume Part II, presented in Bordeaux, France, pages 264--275, Springer-Verlag, ISBN: 978-3-642-23396-8, Aug. 2011)
Abstract:
Optimized implementations of blocking and nonblocking collective operations are most important for scalable high-performance applications. Offloading such collective operations into the communication
layer can improve performance and asynchronous progression of the operations. However, it is most important that such offloading schemes
remain flexible in order to support user-defined (sparse neighbor) collective communications. In this work, we describe an operating system
kernel-based architecture for implementing an interpreter for the flexible Group Operation Assembly Language (GOAL) framework to offload
collective communications. We describe an optimized scheme to store
the schedules that define the collective operations and show an extension to profile the performance of the kernel layer. Our microbenchmarks
demonstrate the effectiveness of the approach and we show performance
improvements over traditional progression in user-space. We also discuss
complications with the design and offloading strategies in general.
BibTeX:
@inproceedings{schneider-goal-kernel,
  author={T. Schneider and S. Eckelmann and T. Hoefler and W. Rehm},
  title={{Kernel-Based Offload of Collective Operations - Implementation, Evaluation and Lessons Learned}},
  year={2011},
  month={Aug.},
  pages={264--275},
  booktitle={Proceedings of the 17th international conference on Parallel processing - Volume Part II},
  location={Bordeaux, France},
  publisher={Springer-Verlag},
  isbn={978-3-642-23396-8},
  source={http://www.unixer.de/~htor/publications/},
}