Discamus continentiam augere, luxuriam coercere

Ab initio calculations are widely used to compute ground-state energies of electronic systems. The results can be used, for example, for structure optimization or to assess different electronic properties of materials. These calculations are used in the emerging field of nanoscience as well as in more traditional areas such as materials science.

My current work aims at optimizing and parallelizing the program ABINIT. The serial optimization will reduce the running time on particular CPUs, and the improved parallelization will enable efficient execution on hundreds of processors. Please refer to my publications for further details.

Analyzing ABINIT

We conducted several studies to analyze the running time of the parallel version of ABINIT on a cluster system. The first study, "A short Performance Analysis of Abinit on a Cluster System", analyzes the scalability of different parallelization methods of ABINIT on a small commodity cluster system with 16 nodes. A second study, "A short Performance Analysis of Abinit under different build environments", analyzes the influence of different build environments (math libraries, compilers) on ABINIT version 4.5.2.

A complete callgraph of all possible calls in the source code, generated by a Python script, is available here (huge):
Callgraph.eps - (2318.96 kb)
Callgraph.png - (1998.79 kb)
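
The script that produced this callgraph is not reproduced on this page. As a rough illustration of how such a static callgraph could be extracted, the following minimal sketch scans Fortran source text for subroutine headers and call statements and emits a Graphviz description. The regular expressions, the function names build_callgraph and to_dot, and the toy input are all assumptions made for illustration; this is not the original script, and it ignores function calls, continuation lines, and '!' characters inside strings.

```python
import re
from collections import defaultdict

# Assumed patterns: a subroutine header and a "call name" statement.
UNIT_RE = re.compile(r"^\s*(?:recursive\s+)?subroutine\s+(\w+)", re.IGNORECASE)
CALL_RE = re.compile(r"\bcall\s+(\w+)", re.IGNORECASE)


def build_callgraph(source):
    """Map each subroutine name to the set of subroutines it calls."""
    graph = defaultdict(set)
    current = None  # name of the subroutine whose body is being scanned
    for line in source.splitlines():
        code = line.split("!", 1)[0]  # drop trailing comments
        header = UNIT_RE.match(code)
        if header:
            current = header.group(1).lower()
            graph[current]  # register the node even if it calls nothing
            continue
        if current is None:
            continue  # statement outside any subroutine
        for callee in CALL_RE.findall(code):
            graph[current].add(callee.lower())
    return graph


def to_dot(graph):
    """Render the callgraph in Graphviz dot syntax."""
    lines = ["digraph callgraph {"]
    for caller in sorted(graph):
        for callee in sorted(graph[caller]):
            lines.append(f'  "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Toy input with hypothetical routine names (not ABINIT source code).
EXAMPLE = """
subroutine driver(n)
  call solver(n)
  if (n > 0) call report(n)
end subroutine driver

subroutine solver(n)
  call kernel(n)  ! innermost work
end subroutine solver
"""

if __name__ == "__main__":
    print(to_dot(build_callgraph(EXAMPLE)))
```

The resulting dot text can be rendered to EPS or PNG with Graphviz. Because every possible call is included, the graph of a large code grows quickly, which is why a profile of an actual run is more informative.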
The complete static callgraph is not very useful, so we profiled a large ground-state calculation, which used 13 GB of memory and saved the whole wfk (wavefunction file) to disk (actually an extremely slow NFS). The trace file of this run is attached as Prof.txt - (114.05 kb). This analysis shows that the lion's share of the total running time is spent in the following functions:

Most time-consuming functions

name        percentage
projbd      36.05
sg_ffty     14.82
opernl4a    10.34
opernl4b     8.7
sg_fftpx     6.65

The profiling results also show that 97.3% of the running time is spent in the subtree of vtowfk() and 83.6% in the subtree of cgwf(). The callgraphs for this calculation (generated with cgprof):
Cgprof.eps - (255.12 kb)
Cgprof.png - (153.6 kb)
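
To put the profile numbers in perspective, the following small sketch adds up the listed percentages. It assumes that the per-function values are exclusive (self) times, so that they can be summed without double counting; the page does not state this explicitly. The variable names are chosen here for illustration.

```python
# Percentages copied from the list of most time-consuming functions.
# Assumption: these are exclusive (self) times, so they may be added.
top_functions = {
    "projbd": 36.05,
    "sg_ffty": 14.82,
    "opernl4a": 10.34,
    "opernl4b": 8.7,
    "sg_fftpx": 6.65,
}

# Inclusive share of the vtowfk() subtree, as reported by the profile.
vtowfk_subtree = 97.3

# Combined share of the five listed functions.
top_five_share = sum(top_functions.values())

# Running time spent in all other functions together.
remaining_share = 100.0 - top_five_share

# Running time spent outside the vtowfk() subtree.
outside_vtowfk = 100.0 - vtowfk_subtree

if __name__ == "__main__":
    print(f"top five functions: {top_five_share:.2f}%")
    print(f"all other functions: {remaining_share:.2f}%")
    print(f"outside vtowfk():   {outside_vtowfk:.2f}%")
```

Under this assumption the five functions account for roughly three quarters of the running time (about 76.56%), so they are the natural first targets for optimization.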

© Torsten Hoefler