Bits & Bytes online Edition

The Munich-ATLAS-Tier2 project

Renate Dohmen, Christian Guggenberger

Collaborating research groups from the Max Planck Institute of Physics (MPP) and the Ludwig Maximilian University (LMU) play a significant role in the ATLAS project at the Large Hadron Collider (LHC) of the European Organization for Nuclear Research (CERN). In this experiment, two beams of protons are accelerated to nearly the speed of light and steered towards each other to collide head-on. Each collision of two protons can generate up to several thousand new particles. Various detectors will record the trajectories and energies of these particles. Huge amounts of data are expected, which have to be stored, processed and analyzed. The ATLAS detector, in particular, will provide data from 2008 on, and many institutes all over the world intend to work on these data. To make this feasible, a hierarchical network of so-called Tier centres is being established, through which the data are shared and replicated in a controlled fashion and the workload is distributed among the centres.

The raw data remain at the Tier0 centre at CERN. Tier1 centres receive preprocessed copies and distribute them to their Tier2 and Tier3 centres, where the simulations and the analysis of the data are carried out. The German Tier1 centre is the GridKa facility at the Forschungszentrum (FZ) Karlsruhe. In Munich, a Tier2 centre was set up by the two physics institutes MPP and LMU together with their associated computing centres, RZG and LRZ respectively. The tasks of a Tier2 centre are to provide a certain amount of compute and storage resources, which are placed at the disposal of all sites taking part in the WLCG (Worldwide LHC Computing Grid) community, and to establish sufficient network bandwidth and services for the data exchange with the associated Tier1 centre. Furthermore, the Tier2 centre has to provide and support a so-called Grid middleware system, which enables all partners in the WLCG to access and use the shared resources.
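The hierarchy described above can be pictured as a tree along which data copies travel from CERN down to the individual centres. The following is only an illustrative sketch in Python: the class `TierCentre`, the function `distribution_path` and the label "Munich-Tier2" are hypothetical names chosen for this example and are not part of any WLCG software.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TierCentre:
    """One node in the Tier hierarchy (hypothetical model, not WLCG code)."""
    name: str
    tier: int
    children: List["TierCentre"] = field(default_factory=list)

    def add(self, child: "TierCentre") -> "TierCentre":
        """Attach a lower-tier centre and return it for further chaining."""
        self.children.append(child)
        return child


def distribution_path(root: TierCentre, target: str) -> Optional[List[str]]:
    """Return the chain of centres from root down to target, or None."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = distribution_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


# Only the centres named in the text are modelled here.
tier0 = TierCentre("CERN", 0)
gridka = tier0.add(TierCentre("GridKa", 1))
munich = gridka.add(TierCentre("Munich-Tier2", 2))

print(distribution_path(tier0, "Munich-Tier2"))
```

In this picture, preprocessed data destined for Munich passes through exactly one Tier1 centre, which is why the bandwidth to GridKa matters for the Tier2 site.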

The hardware currently provided by the MPP and installed at the RZG consists of 126 worker nodes (approximately 500 CPU cores), several AFS file servers (7.2 TB of non-migrating and 25 TB of migrating disk space) and 30 TB of dCache disk space for the efficient storage of large amounts of data.

The LMU at present runs 37 worker nodes (approximately 150 CPU cores) and 10 dCache pools with 40 TB in total, which are hosted by the LRZ. Both partners plan to upgrade their hardware by about 450 CPU cores and 200 TB disk space, which will, however, not be dedicated exclusively to ATLAS but will also serve other Grid activities that can be managed with the same infrastructure.
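For orientation, the current figures of the two partners can be added up to a combined size for the Munich Tier2 centre. The sketch below does only this arithmetic; the class `SiteResources` and the function `combine` are hypothetical names, the core counts are the approximate values quoted above, and the AFS space (provided by the MPP only) and the planned upgrade are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteResources:
    """Current hardware of one partner (figures taken from the text)."""
    worker_nodes: int
    cpu_cores: int   # approximate value
    dcache_tb: int   # dCache disk space in TB


MPP_AT_RZG = SiteResources(worker_nodes=126, cpu_cores=500, dcache_tb=30)
LMU_AT_LRZ = SiteResources(worker_nodes=37, cpu_cores=150, dcache_tb=40)


def combine(a: SiteResources, b: SiteResources) -> SiteResources:
    """Add up the resources of two partners field by field."""
    return SiteResources(
        worker_nodes=a.worker_nodes + b.worker_nodes,
        cpu_cores=a.cpu_cores + b.cpu_cores,
        dcache_tb=a.dcache_tb + b.dcache_tb,
    )


TOTAL = combine(MPP_AT_RZG, LMU_AT_LRZ)
print(TOTAL)
```

Taken together, this gives 163 worker nodes, roughly 650 CPU cores and 70 TB of dCache space.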

The Grid middleware used by the WLCG is gLite. Besides the worker nodes on which the applications are executed, there are several service machines:

    • the User Interface (UI), from which users submit their jobs to the Grid
    • the Computing Element (CE), which converts Grid jobs scheduled to the respective site into local batch jobs and submits them
    • the Storage Element (SE), through which all data stored in the WLCG can be accessed via a common file catalog and which manages data transfers
    • the Monbox (MON), which delivers information on the available hardware, the current load and the software installed at the respective site

gLite provides the services that accomplish these tasks and controls the interaction between them. A sophisticated authentication and authorization mechanism controls access to the various resources.
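The role of the CE, converting a Grid job scheduled to a site into a local batch job, can be illustrated with a small toy model. This is a sketch under stated assumptions and does not use or reproduce any gLite interface: the classes `GridJob`, `LocalBatchJob` and `ComputingElement`, their fields, the queue name "atlas" and the script name are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GridJob:
    """A job as submitted from the User Interface (fields are hypothetical)."""
    job_id: str
    executable: str
    target_site: str


@dataclass
class LocalBatchJob:
    """What the site's batch system would see (fields are hypothetical)."""
    queue: str
    command: str
    grid_job_id: str


class ComputingElement:
    """Toy stand-in for a CE: turns Grid jobs into local batch jobs."""

    def __init__(self, site: str, queue: str) -> None:
        self.site = site
        self.queue = queue
        self.submitted: List[LocalBatchJob] = []

    def accept(self, job: GridJob) -> LocalBatchJob:
        # A CE only handles jobs that were scheduled to its own site.
        if job.target_site != self.site:
            raise ValueError(f"job {job.job_id} is not scheduled to {self.site}")
        local = LocalBatchJob(self.queue, job.executable, job.job_id)
        self.submitted.append(local)  # stands in for the real batch submission
        return local


ce = ComputingElement(site="Munich-Tier2", queue="atlas")
local_job = ce.accept(GridJob("job-001", "run_analysis.sh", "Munich-Tier2"))
print(local_job)
```

The local batch job keeps a reference to the Grid job identifier, so that the status of the batch job can be traced back to the original submission. A real CE would also have to check the user's credentials before accepting the job.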

The main task of the RZG in the Munich-Tier2 centre is to operate and administer the worker nodes, the service machines and the file systems and to install and maintain the Grid middleware for MPP/RZG, while for the physicists the emphasis lies on installing and testing application software. A close collaboration between all four partners is, of course, essential to get the system running and keep it running. The Munich-Tier2 centre is embedded in the Regional Operations Centre (ROC) DECH (Germany/Switzerland), which entails close contact with the associated Tier1 centre and the other Tier2 centres in Germany and Switzerland in order to share experience and coordinate activities.