Bits & Bytes online Edition

DRACO as an extension of HYDRA

Ingeborg Weidl, Hermann Lederer

The DRACO system was delivered in May and put into production in July 2016 as an extension of the HYDRA supercomputer. The added compute resource is intended to reduce the long waiting times on HYDRA and to meet the HPC needs of recently established theory departments at institutes in Frankfurt, Hamburg and Stuttgart.

DRACO consists of 879 compute nodes with Intel Haswell E5-2698 v3 processors (2.3 GHz), each node having 32 cores and 128 GB of main memory. Four of the nodes have 512 GB of RAM, and 106 nodes are equipped with GTX 980 GPU accelerator cards. In total, the system has about 28,000 cores, 116 TB of RAM and a peak performance of about 1 PetaFlop/s. In addition, there are four login nodes and eight I/O nodes that serve an additional 1.5 PetaByte of disk storage. The interconnect is InfiniBand FDR14, the same type as on HYDRA, but with a different blocking factor. The software stack on DRACO is as similar as possible to that on HYDRA, with two exceptions: the batch system is the open-source SLURM, and Intel MPI is used for inter-node communication.
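The quoted totals can be roughly reproduced from the per-node figures. The following is a minimal sketch, not an official calculation: it assumes 16 double-precision floating-point operations per cycle per core (AVX2 with two FMA units) at the nominal 2.3 GHz clock, and it ignores the GPU cards. That per-cycle figure and the variable names are assumptions and do not come from the article.

```python
# Figures taken from the article
NODES = 879
CORES_PER_NODE = 32
CLOCK_GHZ = 2.3

# Assumption (not stated in the article): 16 double-precision FLOPs
# per cycle per core, i.e. AVX2 with two FMA units on Haswell.
FLOPS_PER_CYCLE = 16

# Total number of CPU cores across all compute nodes
total_cores = NODES * CORES_PER_NODE

# Theoretical CPU-only peak in PetaFlop/s at the nominal clock rate
peak_pflops = total_cores * CLOCK_GHZ * 1e9 * FLOPS_PER_CYCLE / 1e15

print(f"cores: {total_cores}")
print(f"CPU peak: {peak_pflops:.3f} PFlop/s")
```

Under these assumptions the result is consistent with the "about 28,000 cores" and "about 1 PetaFlop/s" stated above.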

With respect to the total job mix, DRACO is intended for the smaller batch jobs of up to 32 nodes: it is organized into many 'compute islands', each with a non-blocking full fat tree interconnect for up to 1024 cores, and with a blocking factor of 1:8 between islands. HYDRA, whose 'large' island has a non-blocking full fat tree interconnect for up to 1800 nodes (36,000 cores), will be reserved for the larger batch jobs.
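The 32-node limit follows from the island size: with 32 cores per node, an island of 1024 cores holds 32 nodes. A small sketch of that arithmetic is shown here; the helper `fits_in_one_island` is hypothetical and only illustrates the reasoning, it is not part of the batch system or of any site policy.

```python
# Figures taken from the article
CORES_PER_NODE = 32   # cores per DRACO node
ISLAND_CORES = 1024   # cores per non-blocking 'compute island'

# Number of nodes that share one non-blocking island
NODES_PER_ISLAND = ISLAND_CORES // CORES_PER_NODE


def fits_in_one_island(nodes: int) -> bool:
    """Hypothetical helper: True if a job of this many nodes could be
    placed inside a single island, avoiding the 1:8 blocking between islands."""
    return 1 <= nodes <= NODES_PER_ISLAND


print(NODES_PER_ISLAND)
print(fits_in_one_island(32), fits_in_one_island(33))
```

A job that stays within one island communicates only over the non-blocking part of the fat tree, which is presumably why jobs of this size are directed to DRACO.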

Detailed information about the extension system DRACO can be found on