Publications of Torsten Hoefler
Copyright Notice:
The documents distributed by this server have been provided by the
contributing authors as a means to ensure timely dissemination of
scholarly and technical work on a noncommercial basis. Copyright and all
rights therein are maintained by the authors or by other copyright
holders, notwithstanding that they have offered their works here
electronically. It is understood that all persons copying this
information will adhere to the terms and constraints invoked by each
author's copyright. These works may not be reposted without the explicit
permission of the copyright holder.
Citation Listings:
DBLP
CSB
Google Scholar
ACM Digital Library
Semantic Scholar
Peer-Reviewed Conference or Journal Articles
NIPS'18 | [7] Dan Alistarh, Torsten Hoefler, Mikael Johansson, Sarit Khirirat, Nikola Konstantinov, Cedric Renggli: | | The Convergence of Sparsified Gradient Methods In Advances in Neural Information Processing Systems 31, presented in Montreal, Canada, Curran Associates, Inc., Dec. 2018,  |
SC18 | [9] Heng Lin, Xiaowei Zhu, Bowen Yu, Xiongchao Tang, Wei Xue, Wenguang Chen, Lufei Zhang, Torsten Hoefler, Xiaosong Ma, Xin Liu, Weimin Zheng, Jingfang Xu: | | ShenTu: Processing Multi-Trillion Edge Graphs on Millions of Cores in Seconds In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC18) - Gordon Bell Award Finalist, presented in Denver, CO, USA, ACM, Nov. 2018,  |
GMD | [14] O. Fuhrer, T. Chadha, T. Hoefler, G. Kwasniewski, X. Lapillonne, D. Leutwyler, D. Luethi, C. Osuna, C. Schaer, T. C. Schulthess, H. Vogt: | | Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0 Geoscientific Model Development. Vol 11, Nr. 4, Copernicus Publications, May 2018,  |
IEEE TPDS | [26] Didem Unat, Anshu Dubey, Torsten Hoefler, John Shalf, Mark Abraham, Mauro Bianco, Bradford L. Chamberlain, Romain Cledat, H. Carter Edwards, Hal Finkel, Karl Fuerlinger, Frank Hannig, Emmanuel Jeannot, Amir Kamil, Jeff Keasler, Paul H J Kelly, Vitus Leung, Hatem Ltaief, Naoya Maruyama, Chris J. Newburn, and Miquel Pericas: | | Trends in Data Locality Abstractions for HPC Systems IEEE Transactions on Parallel and Distributed Systems (TPDS). Vol 28, Nr. 10, IEEE, Oct. 2017,  |
HPDC'17 | [31] M. Besta, M. Podstawski, L. Groner, E. Solomonik, T. Hoefler: | | To Push or To Pull: On Reducing Communication and Synchronization in Graph Computations In Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'17), presented in Washington, DC, USA, ACM, Jun. 2017, (acceptance rate: 19%)  |
SPAA'17 | [33] E. Solomonik, G. Ballard, J. Demmel, T. Hoefler: | | A Communication-Avoiding Parallel Algorithm for the Symmetric Eigenvalue Problem Nr. 11, In Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA'17), presented in Washington, DC, USA, pages 111--121, ACM, ISBN: 978-1-4503-4593-4, Jun. 2017,  |
IPDPS'17 | [34] M. Besta, F. Marending, E. Solomonik, T. Hoefler: | | SlimSell: A Vectorized Graph Representation for Breadth-First Search In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)  |
IPDPS'17 | [35] S. Di Girolamo, F. Vella and T. Hoefler: | | Transparent Caching for RMA Systems In Proceedings of the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS'17), presented in Orlando, FL, USA, IEEE, May 2017, (acceptance rate: 22%, 116/516)  |
CIAC'17 | [39] K. T. Foerster, L. Groner, T. Hoefler, M. Koenig, S. Schmid, R. Wattenhofer: | | Multi-agent Pathfinding with n Agents on Graphs with n Vertices: Combinatorial Classification and Tight Algorithmic Bounds In Algorithms and Complexity - 10th International Conference, CIAC 2017, Athens, Greece, May 24-26, 2017, Proceedings, presented in Athens, Greece, May 2017,  |
SC16 | [42] M. Martinasso, G. Kwasniewski, S. R. Alam, T. C. Schulthess, T. Hoefler: | | A PCIe Congestion-Aware Performance Model for Densely Populated Accelerator Servers In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 63:1--63:11, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))  |
SC16 | [43] W. Tang, B. Wang, S. Ethier, G. Kwasniewski, T. Hoefler, K. Z. Ibrahim, K. Madduri, S. Williams, L. Oliker, C. Rosales-Fernandez, T. Williams: | | Extreme Scale Plasma Turbulence Simulations on Top Supercomputers Worldwide In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 43:1--43:12, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))  |
SC16 | [45] T. Gysi, J. Baer, T. Hoefler: | | dCUDA: Hardware Supported Overlap of Computation and Communication In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC16), presented in Salt Lake City, Utah, pages 52:1--52:12, IEEE Press, ISBN: 978-1-4673-8815-3, Nov. 2016, (acceptance rate: 18% (82/446))  |
OOPSLA'16 | [46] Andrei Marian Dan, Patrick Lam, Torsten Hoefler, Martin Vechev: | | Modeling and Analysis of Remote Memory Access Programming In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, presented in Amsterdam, Netherlands, pages 129--144, ACM, ISBN: 978-1-4503-4444-9, Nov. 2016, Outstanding Paper Award at OOPSLA'16 (4/52)  |
Cluster'16 | [47] A. Calotoiu, D. Beckingsale, C. W. Earl, T. Hoefler, I. Karlin, M. Schulz, F. Wolf: | | Fast Multi-Parameter Performance Modeling Oct. 2016, Accepted at IEEE International Conference on Cluster Computing (Cluster'16) (acceptance rate: 24% (39/162))  |
HPDC'16 | [51] P. Schmid, M. Besta, T. Hoefler: | | High-Performance Distributed RMA Locks In Proceedings of the 25th Symposium on High-Performance Parallel and Distributed Computing (HPDC'16), Jun. 2016, (acceptance rate: 16% (20/129)) Karsten Schwan Best Paper Award at HPDC'16 (1/20)  |
SC15 | [56] T. Hoefler, R. Belli: | | Scientific Benchmarking of Parallel Computing Systems presented in Austin, TX, USA, pages 73:1--73:12, ACM, ISBN: 978-1-4503-3723-6, Nov. 2015, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15) (acceptance rate: 22%, 79/358)  |
SC15 | [57] G. Kathareios, C. Minkenberg, B. Prisacari, G. Rodriguez, T. Hoefler: | | Cost-Effective Diameter-Two Topologies: Analysis and Evaluation presented in Austin, TX, USA, ACM, ISBN: 978-1-4503-3723-6, Nov. 2015, In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15) (acceptance rate: 22%, 79/358)  |
HOTI'15 | [60] S. Di Girolamo, P. Jolivet, K. D. Underwood, T. Hoefler: | | Exploiting Offload Enabled Network Interfaces In Proceedings of the 23rd Annual Symposium on High-Performance Interconnects (HOTI'15), presented in Oracle Santa Clara Campus, CA, USA, IEEE, Aug. 2015, Best Student Paper at HOTI'15  |
ICS'15 | [61] S. Shudler, A. Calotoiu, T. Hoefler, A. Strube, F. Wolf: | | Exascaling Your Library: Will Your Implementation Meet Your Expectations? In Proceedings of the 29th International Conference on Supercomputing (ICS'15), presented in Newport Beach, CA, USA, pages 161--175, ACM, ISBN: 978-1-4503-3559-1, Jun. 2015, (acceptance rate: 25% (40/160))  |
ICS'15 | [63] T. Gysi, T. Grosser, T. Hoefler: | | MODESTO: Data-centric Analytic Optimization of Complex Stencil Programs on Heterogeneous Architectures In Proceedings of the 29th International Conference on Supercomputing (ICS'15), presented in Newport Beach, CA, USA, pages 177--186, ACM, ISBN: 978-1-4503-3559-1, Jun. 2015, (acceptance rate: 25% (40/160))  |
HPDC'15 | [66] S. Ramos, T. Hoefler: | | Cache Line Aware Optimizations for ccNUMA Systems In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing (HPDC'15) (short paper), presented in Portland, OR, USA, pages 85--88, ACM, ISBN: 978-1-4503-3550-8, Jun. 2015,  |
CFI'15 | [69] T. Lee, C. Pappas, C. Basescu, J. Han, T. Hoefler, A. Perrig: | | Source-Based Path Selection: The Data Plane Perspective In Proceedings of the 10th International Conference on Future Internet, presented in Seoul, Republic of Korea, pages 41--45, ACM, ISBN: 978-1-4503-3564-5, May 2015,  |
SC14 | [71] J. Domke, T. Hoefler, S. Matsuoka: | | Fail-in-Place Network Design: Interaction between Topology, Routing Algorithm and Failures presented in New Orleans, LA, USA, Nov. 2014, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394)  |
SC14 | [72] K. B. Ferreira, P. Widener, S. Levy, D. Arnold, T. Hoefler: | | Understanding the Effects of Communication and Coordination on Checkpointing at Scale presented in New Orleans, LA, USA, Nov. 2014, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC14) (acceptance rate: 21%, 82/394)  |
EuroMPI'14 | [76] P. Widener, K. Ferreira, S. Levy, T. Hoefler: | | Exploring the effect of noise on the performance benefit of nonblocking allreduce In Proceedings of the 21st European MPI Users' Group Meeting, presented in Kyoto, Japan, pages 77:77--77:82, ACM, ISBN: 978-1-4503-2875-3, Sep. 2014, Invited to a journal special issue on top picks from EuroMPI'14.  |
HPDC'14 | [78] B. Prisacari, G. Rodriguez, P. Heidelberger, D. Chen, C. Minkenberg, T. Hoefler: | | Efficient Task Placement and Routing in Dragonfly Networks In Proceedings of the 23rd ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC'14), presented in Vancouver, Canada, ACM, Jun. 2014, (acceptance rate: 16%, 21/130)  |
ACM TACO | [84] B. Prisacari, G. Rodriguez, C. Minkenberg, T. Hoefler: | | Fast Pattern-Specific Routing for Fat Tree Networks ACM Transactions on Architecture and Code Optimization. Vol 10, Nr. 4, presented in New York, NY, USA, pages 36:1--36:25, ACM, ISSN: 1544-3566, Dec. 2013, (acceptance rate: 24% (2011))  |
SC13 | [85] A. Calotoiu, T. Hoefler, M. Poke, F. Wolf: | | Using Automated Performance Modeling to Find Scalability Bugs in Complex Codes In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 45:1--45:12, ACM, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457)  |
SC13 | [86] R. Gerstenberger, M. Besta, T. Hoefler: | | Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, presented in Denver, Colorado, USA, pages 53:1--53:12, ACM, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457) Best Student Paper Finalist (8/92) and SC13 Best Paper (1/92)  |
SC13 | [87] A. Friedley, G. Bronevetsky, A. Lumsdaine, T. Hoefler: | | Hybrid MPI: Efficient Message Passing for Multi-core Systems In IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC13), presented in Denver, Colorado, USA, pages 18:1--18:11, ISBN: 978-1-4503-2378-9, Nov. 2013, (acceptance rate: 20%, 92/457)  |
PMBS'13 | [88] S. Levy, B. Topp, K. Ferreira, D. Arnold, T. Hoefler, P. Widener: | | Using Simulation to Evaluate the Performance of Resilience Strategies at Scale presented in Denver, CO, USA, Nov. 2013, Proceedings of the 4th International Workshop in Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS13)  |
ICPP'13 | [89] T. Schneider, T. Hoefler, R. Grant, B. Barrett, R. Brightwell: | | Protocols for Fully Offloaded Collective Operations on Accelerated Network Adapters In Parallel Processing (ICPP), 2013 42nd International Conference on, presented in Lyon, France, pages 593-602, ISSN: 0190-3918, Oct. 2013,  |
HPDC'13 | [93] S. Li, T. Hoefler and M. Snir: | | NUMA-Aware Shared Memory Collective Communication for MPI In Proceedings of the 22nd international symposium on High-performance parallel and distributed computing, presented in New York City, NY, USA, pages 85--96, ACM, ISBN: 978-1-4503-1910-2, Jun. 2013, (acceptance rate: 15%, 20/131) Nominated for Best Paper Award at HPDC'13 (3/20)  |
ICS'13 | [94] B. Prisacari, G. Rodriguez, C. Minkenberg and T. Hoefler: | | Bandwidth-optimal All-to-all Exchanges in Fat Tree Networks In Proceedings of the 27th International ACM Conference on International Conference on Supercomputing, presented in Eugene, OR, USA, pages 139--148, ACM, ISBN: 978-1-4503-2130-3, Jun. 2013, (acceptance rate: 21%, 41/198)  |
PPoPP'13 | [96] A. Friedley, T. Hoefler, G. Bronevetsky, A. Lumsdaine: | | Ownership Passing: Efficient Distributed Memory Programming on Multi-core Systems In Proceedings of the 18th ACM SIGPLAN symposium on Principles and practice of parallel programming, presented in Shenzhen, China, pages 177--186, ACM, ISBN: 978-1-4503-1922-5, Feb. 2013, (acceptance rate: 18%, 26/146)  |
SC12 | [97] T. Hoefler, T. Schneider: | | Optimization Principles for Collective Neighborhood Communications In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, presented in Salt Lake City, Utah, USA, pages 98:1--98:10, IEEE Computer Society Press, ISBN: 978-1-4673-0804-5, Nov. 2012, (acceptance rate: 21%, 100/472)  |
EuroMPI'12 | [98] T. Schneider, R. Gerstenberger, T. Hoefler: | | Micro-Applications for Communication Data Access Patterns and MPI Datatypes Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, pages 121-131, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, Invited to a journal special issue on top picks from EuroMPI'12.  |
EuroMPI'12 | [99] S. Pellegrini, T. Hoefler, T. Fahringer: | | Exact Dependence Analysis for Increased Communication Overlap In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, Springer, ISBN: 978-3-642-33517-4, Sep. 2012,  |
EuroMPI'12 | [100] T. Hoefler, J. Dinan, D. Buntinas, P. Balaji, B. Barrett, R. Brightwell, W. Gropp, V. Kale, R. Thakur: | | Leveraging MPI's One-Sided Communication Interface for Shared-Memory Programming Vol 7490, In Recent Advances in the Message Passing Interface - 19th European MPI Users' Group Meeting, EuroMPI 2012, Vienna, Austria, September 23-26, 2012. Proceedings, presented in Vienna, Austria, Springer, ISBN: 978-3-642-33517-4, Sep. 2012, Invited to journal special issue on top picks from EuroMPI'12.  |
PACT'12 | [101] T. Hoefler, T. Schneider: | | Runtime Detection and Optimization of Collective Communication Patterns In Proceedings of the 21st international conference on Parallel Architectures and Compilation Techniques (PACT), presented in Minneapolis, MN, USA, pages 263--272, ACM, ISBN: 978-1-4503-1182-3, Sep. 2012, (acceptance rate: 18.9%, 39/207)  |
Cluster'12 | [102] S. Pellegrini, T. Hoefler, T. Fahringer: | | On the Effects of CPU Caches on MPI Point-to-Point Communications In Proceedings of the 2012 IEEE International Conference on Cluster Computing, presented in Beijing, China, pages 495--503, IEEE Computer Society, ISBN: 978-0-7695-4807-4, Sep. 2012, (acceptance rate: 28.9%, 58/200)  |
CCGrid'12 | [104] G. Bauer, S. Gottlieb and T. Hoefler: | | Performance Modeling and Comparative Analysis of the MILC Lattice QCD Application su3_rmd In Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012), presented in Ottawa, Canada, pages 652--659, IEEE Computer Society, ISBN: 978-0-7695-4691-9, May 2012, (acceptance rate: 27%, 83/302)  |
PDP'12 | [105] K. Kharbas, D. Kim, T. Hoefler and F. Mueller: | | Assessing HPC Failure Detectors for MPI Jobs In Proceedings of the 2012 20th Euromicro International Conference on Parallel, Distributed and Network-based Processing, presented in Munich, Germany, pages 81--88, IEEE Computer Society, ISBN: 978-0-7695-4633-9, Feb. 2012,  |
EuroMPI'11 | [109] W. Gropp, T. Hoefler, R. Thakur and J. L. Traeff: | | Performance Expectations and Guidelines for MPI Derived Datatypes Vol 6960, In Recent Advances in the Message Passing Interface (EuroMPI'11), presented in Santorini, Greece, pages 150-159, Springer, ISBN: 978-3-642-24448-3, Sep. 2011,  |
EuroMPI'11 | [110] V. Venkatesan, M. Chaarawi, E. Gabriel and T. Hoefler: | | Design and Evaluation of Nonblocking Collective I/O Operations Vol 6960, In Recent Advances in the Message Passing Interface (EuroMPI'11), presented in Santorini, Greece, pages 90-98, Springer, ISBN: 978-3-642-24448-3, Sep. 2011,  |
EuroMPI'11 | [111] T. Hoefler and M. Snir: | | Writing Parallel Libraries with MPI - Common Practice, Issues, and Extensions Vol 6960, In Recent Advances in the Message Passing Interface - 18th European MPI Users' Group Meeting, EuroMPI 2011, Santorini, Greece, September 18-21, 2011. Proceedings, presented in Santorini, Greece, pages 345--355, Springer, ISBN: 978-3-642-24448-3, Sep. 2011, Keynote paper at IMUDI/EuroMPI 2011.  |
EuroPar'11 | [112] T. Schneider, S. Eckelmann, T. Hoefler, and W. Rehm: | | Kernel-Based Offload of Collective Operations - Implementation, Evaluation and Lessons Learned In Proceedings of the 17th international conference on Parallel processing - Volume Part II, presented in Bordeaux, France, pages 264--275, Springer-Verlag, ISBN: 978-3-642-23396-8, Aug. 2011, (acceptance rate 29.9%, 81/271)  |
TG'11 | [113] S. Harrell, P. Smith, D. Smith, T. Hoefler, A. Labutina and T. Overmeyer: | | Methods of Creating Student Cluster Competition Teams In Proceedings of the 2011 TeraGrid Conference: Extreme Digital Discovery, presented in Salt Lake City, Utah, pages 50:1--50:6, ACM, Jul. 2011,  |
ICS'11 | [115] J. Willcock, T. Hoefler, N. Edmonds and A. Lumsdaine: | | Active Pebbles: Parallel Programming for Data-Driven Applications In Proceedings of the 2011 ACM International Conference on Supercomputing (ICS'11), presented in Tucson, AZ, pages 235--245, ACM, ISBN: 978-1-4503-0102-2, Jun. 2011, (acceptance rate 21.7%, 35/161)  |
IPDPS'11 | [117] J. Domke, T. Hoefler and W. Nagel: | | Deadlock-Free Oblivious Routing for Arbitrary Topologies In Proceedings of the 25th IEEE International Parallel & Distributed Processing Symposium (IPDPS), presented in Anchorage, AK, USA, pages 613--624, IEEE Computer Society, ISBN: 0-7695-4385-7, May 2011, (acceptance rate: 19.6%, 112/571)  |
PPL | [118] P. Balaji, D. Buntinas, D. Goodell, W. Gropp, T. Hoefler, S. Kumar, E. Lusk, R. Thakur and J. L. Traeff: | | MPI on Millions of Cores Parallel Processing Letters (PPL). Vol 21, Nr. 1, pages 45-60, World Scientific Publishing Company, Mar. 2011,  |
PADL'11 | [120] E. Holk, W. E. Byrd, J. Willcock, T. Hoefler, A. Chauhan and A. Lumsdaine: | | Kanor -- A Declarative Language for Explicit Communication In Proceedings of the 13th international conference on Practical aspects of declarative languages, presented in Austin, TX, USA, pages 190--204, Springer-Verlag, ISBN: 978-3-642-18377-5, Jan. 2011,  |
EuroMPI'10 | [126] T. Hoefler, G. Bronevetsky, B. Barrett, B. R. de Supinski and A. Lumsdaine: | | Efficient MPI Support for Advanced Hybrid Programming Models Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 50--61, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,  |
EuroMPI'10 | [127] T. Hoefler, W. Gropp, R. Thakur and J. L. Traeff: | | Toward Performance Models of MPI Implementations for Understanding Application Scaling Issues Vol LNCS 6305, In Recent Advances in the Message Passing Interface (EuroMPI'10), presented in Stuttgart, Germany, pages 21--30, Springer, ISSN: 0302-9743, ISBN: 978-3-642-15645-8, Sep. 2010,  |
PACT'10 | [129] J. Willcock, T. Hoefler, N. Edmonds and A. Lumsdaine: | | AM++: A Generalized Active Message Framework In Proceedings of the 19th international conference on Parallel architectures and compilation techniques, presented in Vienna, Austria, pages 401--410, ACM, ISBN: 978-1-4503-0178-7, Sep. 2010, (acceptance rate: 17%, 46/266)  |
CCPE | [130] T. Hoefler, R. Rabenseifner, H. Ritzdorf, B. R. de Supinski, R. Thakur and J. L. Traeff: | | The Scalable Process Topology Interface of MPI 2.2 Concurrency and Computation: Practice and Experience. Vol 23, Nr. 4, pages 293-310, John Wiley & Sons, Ltd., ISSN: 1532-0634, Aug. 2010,  |
HotI'10 | [131] B. Arimilli, R. Arimilli, V. Chung, S. Clark, W. Denzel, B. Drerup, T. Hoefler, J. Joyner, J. Lewis, J. Li, N. Ni and R. Rajamony: | | The PERCS High-Performance Interconnect IBM. In Proceedings of 18th Symposium on High-Performance Interconnects (Hot Interconnects 2010), IEEE, Aug. 2010,  |
SciDAC'10 | [135] R. Thakur, P. Balaji, D. Buntinas, D. Goodell, W. Gropp, T. Hoefler, S. Kumar, E. Lusk and J. L. Traeff: | | MPI at Exascale In Proceedings of SciDAC 2010, presented in Chattanooga, Tennessee, Jun. 2010,  |
CAC'09 | [145] T. Hoefler, T. Schneider and A. Lumsdaine: | | A Power-Aware, Application-Based, Performance Study Of Modern Commodity Cluster Interconnection Networks In Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium, CAC'09 Workshop, presented in Rome, Italy, ISSN: 1530-2075, ISBN: 978-1-4244-3750-4, May 2009,  |
LCI'09 | [147] J. Mueller, T. Schneider, J. Domke, R. Geyer, M. Haesing, T. Hoefler, S. Hoehlig, G. Juckeland, A. Lumsdaine, M. Mueller and W. Nagel: | | Cluster Challenge 2008: Optimizing Cluster Configuration and Applications to Maximize Power Efficiency In Proceedings of the 10th LCI International Conference on High-Performance Clustered Computing, presented in Boulder, CO, Mar. 2009, LCI'09 Best Paper Award  |
EuroMPI'08 | [150] T. Hoefler, M. Schellmann, S. Gorlatch and A. Lumsdaine: | | Communication Optimization for Medical Image Reconstruction Algorithms Vol LNCS 5205, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, presented in Dublin, Ireland, pages 75-83, Springer, ISSN: 0302-9743, ISBN: 978-3-540-87474-4, Sep. 2008,  |
EuroMPI'08 | [151] T. Hoefler, F. Lorenzen and A. Lumsdaine: | | Sparse Non-Blocking Collectives in Quantum Mechanical Calculations Vol LNCS 5205, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, 15th European PVM/MPI Users' Group Meeting, presented in Dublin, Ireland, pages 55-63, Springer, ISSN: 0302-9743, ISBN: 978-3-540-87474-4, Sep. 2008,  |
HotI'08 | [152] P. Geoffray and T. Hoefler: | | Adaptive Routing Strategies for Modern High Performance Networks In 16th Annual IEEE Symposium on High Performance Interconnects, HOTI'08, presented in Stanford, CA, USA, pages 165-172, IEEE Computer Society, ISBN: 978-0-7695-3380-3, Aug. 2008, (acceptance rate 30%, 14/47)  |
SPAA'08 | [153] T. Hoefler, P. Gottschling and A. Lumsdaine: | | Brief Announcement: Leveraging Non-blocking Collective Communication in High-performance Applications In Proceedings of the Twentieth Annual Symposium on Parallelism in Algorithms and Architectures, SPAA'08, presented in Munich, Germany, pages 113-115, Association for Computing Machinery (ACM), ISBN: 978-1-59593-973-9, Jun. 2008, (short paper) (acceptance rate: 28%, 36/128)  |
SC07 | [159] T. Hoefler, A. Lumsdaine and W. Rehm: | | Implementation and Performance Analysis of Non-Blocking Collective Operations for MPI In Proceedings of the 2007 International Conference on High Performance Computing, Networking, Storage and Analysis, SC07, presented in Reno, USA, IEEE Computer Society/ACM, Nov. 2007, (acceptance rate 20%, 54/268)  |
EuroMPI'07 | [160] T. Hoefler, P. Kambadur, R. L. Graham, G. Shipman and A. Lumsdaine: | | A Case for Standard Non-Blocking Collective Operations Vol 4757, In Recent Advances in Parallel Virtual Machine and Message Passing Interface, EuroPVM/MPI 2007, presented in Paris, France, pages 125-134, Springer, ISSN: 0302-9743, ISBN: 978-3-540-75415-2, Oct. 2007,  |
HPCC'07 | [162] T. Hoefler, T. Mehlan, A. Lumsdaine and W. Rehm: | | Netgauge: A Network Performance Measurement Framework Vol 4782, In Proceedings of High Performance Computing and Communications, HPCC'07, presented in Houston, USA, pages 659-671, Springer, ISBN: 978-3-540-75443-5, Sep. 2007,  |
FHPCN'06 | [166] T. Hoefler, J. Squyres, W. Rehm and A. Lumsdaine: | | A Case for Non-Blocking Collective Operations Vol 4331/2006, In Frontiers of High Performance Computing and Networking - ISPA'06 Workshops, presented in Sorrento, Italy, pages 155-164, Springer Berlin / Heidelberg, ISBN: 978-3-540-49860-5, Dec. 2006,  |
EuroMPI'06 | [168] T. Hoefler, P. Gottschling, W. Rehm and A. Lumsdaine: | | Optimizing a Conjugate Gradient Solver with Non-Blocking Collective Operations In Recent Advances in Parallel Virtual Machine and Message Passing Interface. 13th European PVM/MPI Users' Group Meeting, Proceedings, LNCS 4192, presented in Bonn, Germany, pages 374-382, Springer, ISSN: 0302-9743, ISBN: 3-540-39110-X, Sep. 2006, Invited to a journal special issue on top picks from EuroMPI'06.  |
PARELEC'06 | [169] T. Hoefler, C. Viertel, T. Mehlan, F. Mietke, W. Rehm: | | Assessing Single-Message and Multi-Node Communication Performance of InfiniBand In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering (PARELEC'06), presented in Bialystok, Poland, pages 227-232, IEEE Computer Society, ISBN: 0-7695-2554-7, Sep. 2006,  |
PARELEC'06 | [170] T. Mehlan, J. Strunk, T. Hoefler, F. Mietke and W. Rehm: | | IRS - A portable Interface for Reconfigurable Systems In Proceedings of IEEE International Conference on Parallel Computing in Electrical Engineering (PARELEC'06), presented in Bialystok, Poland, pages 187-191, IEEE Computer Society, ISBN: 0-7695-2554-7, Sep. 2006,  |
DAPSYS'06 | [171] T. Hoefler, J. Squyres, G. Fagg, G. Bosilca, W. Rehm and A. Lumsdaine: | | A New Approach to MPI Collective Communication Implementations In Distributed and Parallel Systems - From Cluster to Grid Computing (DAPSYS'06), presented in Innsbruck, Austria, pages 45-54, Springer, ISBN: 978-0-387-69857-1, Sep. 2006,  |
EuroPar'06 | [172] F. Mietke, R. Baumgartl, R. Rex, T. Mehlan, T. Hoefler and W. Rehm: | | Analysis of the Memory Registration Process in the Mellanox InfiniBand Software Stack In Proceedings of Euro-Par 2006 Parallel Processing, presented in Dresden, Germany, pages 124-133, Springer-Verlag Berlin, ISBN: 3-540-37783-2, Aug. 2006, (acceptance rate 37.9%, 110/290)  |
HPCE'05 | [176] T. Hoefler, R. Janisch and W. Rehm: | | Improving the parallel scaling of ABINIT CINECA Consorzio Interuniversitario. In Science and Supercomputing in Europe - Report 2005, presented in Casalecchio di Reno, Italy, pages 551-559, CINECA Consorzio Interuniversitario, ISBN: 88-86037-17-1, Dec. 2005,  |
Invited Talks and Presentations
ZiH-TUD | [288] T. Hoefler: | | Non-blocking Collectives for MPI-2 (Presentation) Dresden University of Technology, Center for Information Services and High Performance Computing (ZIH). presented in Dresden, Germany, Oct. 2007,  |
Other Publications or Technical Reports
CASCON'07 | [313] T. Schneider, S. Wunderlich, W. Rehm, T. Hoefler and H. Schick: | | Code Optimization for Cell/B.E. - Opportunities for ABINIT In IBM CASCON 2006 Symposium, presented in Dublin, Ireland, IBM, Oct. 2007, Research Poster at the IBM CASCON 2006 Symposium, Dublin, Ireland  |
IUCS-TR | [315] T. Hoefler, J. Squyres, G. Bosilca, G. Fagg, A. Lumsdaine and W. Rehm: | | Non-Blocking Collective Operations for MPI-2 Open Systems Lab, Indiana University. presented in Bloomington, IN, USA, School of Informatics, Aug. 2006,  |
CIB-06-06 | [317] T. Hoefler, M. Reinhardt, F. Mietke, T. Mehlan, W. Rehm: | | Low Overhead Ethernet Communication for Open MPI on Linux Clusters TU Chemnitz. Vol CSR-06, Nr. 06, In Chemnitzer Informatik Berichte, presented in Chemnitz, TU Chemnitz, ISSN: 0947-5125, Jul. 2006,  |
22C3 | [320] T. Hoefler: | | The Cell Processor 22. Chaos Communication Congress. In 22C3 Proceedings, presented in Berlin, Germany, pages 286-292, ISBN: 3-934636-04-7, Dec. 2005,  |
21C3 | [327] T. Hoefler: | | Remote Network Analysis 21. Chaos Communication Congress. In 21C3 Proceedings, presented in Berlin, Germany, pages 33-37, ISBN: 3-934636-02-0, Dec. 2004,  |
Theses |