Publications of Torsten Hoefler
Copyright Notice:
The documents distributed by this server have been provided by the
contributing authors as a means to ensure timely dissemination of
scholarly and technical work on a noncommercial basis. Copyright and all
rights therein are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their works here
electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each
author's copyright. These works may not be reposted without the explicit
permission of the copyright holder.
T. Hoefler:
Efficient networking and programming of large-scale computing systems
(Presentation - presented in Palo Alto, CA, USA, Jun. 2015)
Abstract: We will discuss efficient techniques for
large-scale datacenter networking. We start by introducing a
high-performance, cost-effective network topology called Slim Fly that
approaches the theoretically optimal network diameter. Slim Fly is based
on graphs that approximate the solution to the degree-diameter problem.
We analyze Slim Fly and compare it to both traditional and state-of-the-art networks. Our analysis shows that Slim Fly has significant
advantages over other topologies in latency, bandwidth, resiliency,
cost, and power consumption. Finally, we propose deadlock-free routing
schemes and physical layouts for large computing centers as well as a
detailed cost and power model. We continue our discussion by considering
the endpoint interface. Here, we propose remote memory access
programming which offers abstractions to coordinate directly accessible
distributed memory domains. We then show how RMA programming
simplifies design and tuning, and introduce MPI-3's RMA semantics as
a particular example. We discuss foMPI, our reference implementation
for Cray machines, and demonstrate results with up to half a million
processes. We conclude the talk by addressing producer-consumer
synchronizations in task-based runtime environments and the new proposal
of notified access. Overall, we advocate RMA as a potential programming
model for scalable systems ranging from single-die multicores to
large-scale supercomputers.
Documents
download slides:
BibTeX
@misc{hoefler-networking-hp-labs,
  author={T. Hoefler},
  title={{Efficient networking and programming of large-scale computing systems}},
  year={2015},
  month={Jun.},
  location={Palo Alto, CA, USA},
  source={http://www.unixer.de/~htor/publications/},
}