In this article we introduce the DataHub service, which was set up to extend the options for moving large-scale data sets into and out of the MPCDF. The DataHub server provides a general Globus Online staging server with an added sftp server for data management within the MPCDF.

Globus Online ([Globus.org](http://globus.org)) is a free service which allows users to move large volumes of data in a simple and reliable manner. In general, Globus Online requires sites to set up a Globus server for data transfers. However, individual users can also install a personal client that lets them move data to and from Globus servers. The DataHub allows users to move data between Globus-enabled sites, and to use Globus personal clients to move data between a computing center and the MPCDF or their laptop. Figure 1 shows the role of the DataHub server for data transfers to/from the MPCDF.

Within the MPCDF, users can use Globus Online or sftp to move data to/from Linux clusters, desktops and laptops. For external systems, users can use Globus Online to move data between Globus-enabled sites and/or Globus personal clients, staging data over the DataHub if needed. The DataHub is available via Globus Online under the endpoint name mpcdf#datahub and, within the MPCDF, via sftp (username)@datahub.mpcdf.mpg.de. An example workflow for importing data into MPCDF systems looks as follows:

1. Copy data from an external Globus Online server to the MPCDF DataHub.
2. Copy data from the MPCDF DataHub to an internal MPCDF system (using Globus or sftp).

Similar workflows are possible using Globus Connect Personal clients, which can be deployed on any external system.

In conclusion:

The DataHub server provides a Globus Online endpoint for general use by MPCDF users and also enables sftp for data management within the MPCDF.

Please note: The storage provided by the DataHub is scratch-based (temporary) and is cleaned regularly. Under the current deletion policy, data older than two weeks will be deleted.

Yacora on the Web

Raphael Ritz

The website www.yacora.de was built to make some of the existing collisional radiative models available for simulation runs without the need to install any software. The models are based on the flexible package Yacora (yet another collisional radiative model), developed at the IPP. To use Yacora on the web, users have to register first; only a username, a password and the affiliation are required. After registration, the user gets access to pages for specifying and submitting input parameters for Yacora. In addition to automatic validation of the input parameters, an offline check follows to make sure the system can handle the request. On approval, the simulation run is triggered. This procedure is illustrated in Figure 2. On completion of the simulation, the user is notified that the results are available for inspection and download from their user folder.

In this way, people can use the Yacora simulation tool without having to download and install any software on their side. Detailed descriptions of the supported models are available from the help section or from yacora-webmaster@ipp.mpg.de. If you are interested in making your code available via a similar web-based portal, please feel free to contact us.

Figure 2. Workflow of Yacora on the web

High-performance Computing

Ingeborg Weidl, Renate Dohmen, Christian Guggenberger, Hermann Lederer

HPC cluster Hydra

After five years of operation, the 628 'SandyBridge' nodes of the Hydra cluster were decommissioned in September 2017. The MPG has contracted a first phase of a successor system for Hydra (see below). To prepare for the installation of this system, another part of the Hydra cluster (628 'IvyBridge' nodes) was taken out of operation in November. The Hydra cluster now has 61,240 cores with a total of 204 TB of main memory. Because fewer compute nodes are available, queue waiting times for batch jobs on Hydra may increase somewhat.

HPC extension cluster Draco

At the end of August 2017, the Draco cluster was expanded by 50 Intel 'Broadwell' nodes (each with 32 cores, 128 GB memory) that were purchased by the MPI für Struktur und Dynamik der Materie. Another 12 'Broadwell' nodes were added by the MPI für Ornithologie. In total there are now 32,528 cores available with a main memory of 136 TB and a peak performance of 1.18 PFlop/s.

Hydra follow-on system

As a partial replacement of the Hydra system, the MPG has carried out a competitive European procurement and selected an Intel 'Skylake' processor based system from Intel and Atos. Delivery is underway, and the installation will take place in January, with an expected start of operation in February/March 2018. The system contains more than 2500 compute nodes with 40 cores each. Half of the nodes are equipped with 96 GB RAM, the other half with 192 GB RAM, and eight nodes contain 768 GB RAM. The interconnect is Intel OmniPath. The theoretical peak performance will be 8 PFlop/s. The batch system will be SLURM, as on Draco, and the software environment will be the same as on Hydra and Draco.

Inastemp: a Vectorization Library to Accelerate C++ Codes

Berenger Bramas, Markus Rampp

Each new CPU generation provides higher theoretical peak performance than its predecessor. While this growth was mainly sustained by increases in CPU clock frequency in previous decades, the gain is nowadays mainly obtained by increasing the number of cores on a chip and by increasing the number of floating-point values that the floating-point unit (FPU) can process in a single clock cycle. This capability of a CPU to apply a single instruction to several values (in an array, say) is called ‘vectorization’, or SIMD for ‘single instruction, multiple data’. Over time, the widespread x86 CPU instruction set has been extended with vectorization instruction sets such as SSE, AVX, and AVX-512, the latter of which has recently reached a SIMD register length of 512 bit. Together with a fused multiply-add (FMA) instruction, this makes it possible to perform up to 32 double-precision (64 bit) floating-point operations per clock cycle and core.

Provided sufficient data parallelism can be exposed in application code, compilers (in particular C, C++ and Fortran compilers such as Intel, GNU, PGI, ...) are able to automatically vectorize data-parallel regions of the code and to generate efficient machine code. In cases where this technique fails or does not exploit the vectorization capabilities to a sufficient degree, it has become common, also in scientific applications, to code performance-critical regions (‘kernels’) using so-called SIMD intrinsics, which are sets of relatively low-level and CPU-architecture specific instructions that compilers can efficiently map to vector instructions.
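The figure of 32 double-precision operations per clock cycle quoted above can be reproduced with a few lines of arithmetic. The following sketch is illustrative only: the function names are made up for this example, and the number of FMA units per core (two) is an assumption about the CPU that is not stated in the text.

```cpp
#include <cassert>
#include <cstddef>

// Number of values of a given width that fit into one SIMD register.
constexpr std::size_t simd_lanes(std::size_t register_bits,
                                 std::size_t value_bits) {
    return register_bits / value_bits;
}

// Floating-point operations per clock cycle and core.
// Each FMA counts as two operations (one multiply, one add) per lane.
// fma_units is an assumption about the core, not taken from the article.
constexpr std::size_t flops_per_cycle(std::size_t register_bits,
                                      std::size_t value_bits,
                                      std::size_t fma_units) {
    return simd_lanes(register_bits, value_bits) * 2 * fma_units;
}

// A 512-bit register holds eight 64-bit doubles.
static_assert(simd_lanes(512, 64) == 8, "8 doubles per 512-bit register");
// 8 lanes x 2 operations per FMA x 2 FMA units (assumed) = 32.
static_assert(flops_per_cycle(512, 64, 2) == 32, "32 operations per cycle");
```

With the same assumptions, a 256-bit register (as in AVX) yields half that value, which shows how the register width directly enters the theoretical peak performance.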

To bridge the gap between low-level SIMD intrinsics and high-level C++ application code, an open-source library, inastemp (‘intrinsics as template’), has been developed at the MPCDF. It employs C++ templates to provide an abstraction layer for handling different sets of SIMD intrinsics. In this way, inastemp facilitates tapping the high SIMD performance of modern CPUs, especially in situations where conventional auto-vectorization by the compiler fails, without having to sacrifice portability and readability of the application code. The library supports basic arithmetic operations, a fast exponential function, optimized handling of if-branches, reductions, and automatic detection of the CPU architecture at compile time. It is released under a permissive license. Currently, inastemp supports all major x86 instruction sets up to the latest 512-bit wide AVX-512 used by the Intel Skylake CPU generation (the processor in the upcoming HPC system of the MPG), as well as IBM Power8 (128-bit wide VSX). Support for further CPU architectures will be added over time. For more details and code examples, the reader is referred to the publication of inastemp [1] and its software repository [2], from where the library can be downloaded. The development of inastemp was motivated by a joint project of the MPCDF with the MPI of Biophysics, and the library has been successfully applied to substantially speed up particle-particle interaction kernels [1] in a newly developed C++ code for coarse-grained simulations of biomolecular complexes.
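The general idea of hiding the vector width behind a C++ template can be illustrated with a small, portable sketch. Note that this is not the inastemp API: the type Pack and the function axpy are hypothetical names introduced only for this example, and the sketch uses plain loops instead of intrinsics. A real implementation would provide specializations that map these operations to the intrinsics of a given instruction set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical fixed-width "vector" type (NOT part of inastemp).
// N plays the role of the number of SIMD lanes.
template <class T, std::size_t N>
struct Pack {
    std::array<T, N> v;

    static constexpr std::size_t size() { return N; }

    // Fill all lanes with the same scalar value.
    static Pack broadcast(T s) {
        Pack r;
        for (std::size_t i = 0; i < N; ++i) r.v[i] = s;
        return r;
    }
    // Read N consecutive values from memory.
    static Pack load(const T* p) {
        Pack r;
        for (std::size_t i = 0; i < N; ++i) r.v[i] = p[i];
        return r;
    }
    // Write N consecutive values to memory.
    void store(T* p) const {
        for (std::size_t i = 0; i < N; ++i) p[i] = v[i];
    }
    friend Pack operator+(const Pack& a, const Pack& b) {
        Pack r;
        for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] + b.v[i];
        return r;
    }
    friend Pack operator*(const Pack& a, const Pack& b) {
        Pack r;
        for (std::size_t i = 0; i < N; ++i) r.v[i] = a.v[i] * b.v[i];
        return r;
    }
};

// Kernel written once against the abstract vector type:
// y[i] = a * x[i] + y[i] for i in [0, n).
template <class Vec>
void axpy(double a, const double* x, double* y, std::size_t n) {
    constexpr std::size_t w = Vec::size();
    const Vec va = Vec::broadcast(a);
    std::size_t i = 0;
    // Main loop: process w values per iteration.
    for (; i + w <= n; i += w) {
        const Vec r = va * Vec::load(x + i) + Vec::load(y + i);
        r.store(y + i);
    }
    // Scalar remainder loop for the last n % w values.
    for (; i < n; ++i) y[i] = a * x[i] + y[i];
}
```

A call such as axpy<Pack<double, 8>>(a, x, y, n) would then select a width of eight doubles, while axpy<Pack<double, 1>>(a, x, y, n) falls back to scalar processing; the kernel source itself stays unchanged, which is the portability argument made above.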


News & Events

Raphael Ritz, Hermann Lederer

International HPC Summer School 2018

The next (ninth) ‘International Summer School on HPC Challenges in Computational Sciences’ (IHPCSS) will take place from July 8th to 13th, 2018 in Prague. It is organized by PRACE for Europe, XSEDE for the US, SCinet HPC Consortium Canada for Canada and RIKEN AICS for Japan. PhD students and postdocs who apply HPC in their research and work in one of the scientific organizations in Europe, the US, Canada or Japan are eligible to apply; participants will be selected via a review process. In the course of one week, essential aspects of the challenges in different scientific disciplines will be addressed, as well as state-of-the-art programming and visualization techniques and performance analysis. For the students, participation including accommodation and meals will be free of charge. For details see http://www.ihpcss.org.

RDA Meeting in March 2018

Co-organized by the MPCDF, the 11th Plenary Meeting of the Research Data Alliance (RDA) will take place from March 21st to 23rd, 2018 in Berlin, Germany. Under the theme ‘From Data to Knowledge’, the Plenary meeting welcomes the participation of all data scientists, experts and practitioners engaged in the advancement of data-driven science and economy. The RDA Plenary meeting provides ideal circumstances to discuss the opportunities and challenges of a global data ecosystem.