CASUS involved in development of the world’s largest supercomputer Frontier

For a year now, researchers from CASUS, the Center for Advanced Systems Understanding at the Helmholtz-Zentrum Dresden-Rossendorf (HZDR), have been supporting the work on Frontier, the world’s largest planned exascale computer. The collaboration with the University of Delaware is now yielding its first results: the researchers have successfully tested their PIConGPU simulation software on one of the world’s fastest graphics cards for high-performance computers, the recently released MI100 from AMD.

The construction of new supercomputers and the introduction of new hardware such as state-of-the-art graphics cards usually mean that software tailored to the respective model has to be written. In close collaboration with physicists and computer scientists at the HZDR led by Dr. Alexander Debus, Dr. Thomas Kluge and Dr. Guido Juckeland, Dr. Michael Bussmann’s CASUS team has developed solutions based on standardized computer codes that work efficiently on almost all high-performance computers, independent of the hardware used. Thanks to a uniform library, the codes, which are often several hundred thousand lines long, do not have to be rewritten each time.

This considerably reduces both the programming effort and the potential for errors. “With our applications, we attracted attention from US development labs – so much that we were given access to prototype exascale computers. This preferential treatment was denied to most of our competitors. We are delighted to have been invited, together with the University of Delaware, as one of only eight teams to test and adapt our software on the new generation of supercomputers,” says CASUS team leader Bussmann, describing how the collaboration came about.

Toward uniform codes with alpaka

The teams from HZDR and CASUS have designed PIConGPU, an extremely versatile simulation software, specifically for plasma and laser physics. PIConGPU stands for Particle-In-Cell on Graphics Processing Unit. The code is intended for use, for example, in the development of particle accelerators for radiation therapy of cancer, in high-energy physics and in research with photons. To ensure that the simulation code runs on different types of hardware without having to be constantly adapted, the researchers use alpaka.

This program library, which is currently being further developed at CASUS for exascale applications, makes it possible to write software only once and then run it efficiently on a wide variety of hardware systems. alpaka has already been used successfully on other supercomputers, for example on IBM’s Summit, currently the most powerful high-performance computer in the USA, as well as on the Taurus cluster of the Centre for Information Services and High-Performance Computing at TU Dresden and on the HZDR system Hemera.
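The "write once, run on different hardware" idea can be illustrated with a small sketch. It is only an analogy in Python and does not show alpaka's actual C++ interface: the kernel saxpy_kernel and the two backends run_serial and run_threaded are hypothetical names invented for this example. The point is that the per-element computation is written a single time, while the decision about how it is executed is made elsewhere.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(i, a, x, y, out):
    """Per-element work, written once and unaware of how it is scheduled."""
    out[i] = a * x[i] + y[i]


def run_serial(kernel, n, *args):
    """Hypothetical backend 1: a plain loop on a single core."""
    for i in range(n):
        kernel(i, *args)


def run_threaded(kernel, n, *args, workers=4):
    """Hypothetical backend 2: the same indices spread over a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and re-raises any worker exception.
        list(pool.map(lambda i: kernel(i, *args), range(n)))


def saxpy(backend, a, x, y):
    """Run the unchanged kernel on whichever backend the caller selects."""
    out = [0.0] * len(x)
    backend(saxpy_kernel, len(x), a, x, y, out)
    return out


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result_serial = saxpy(run_serial, 2.0, x, y)
result_threaded = saxpy(run_threaded, 2.0, x, y)
print(result_serial == result_threaded)
```

Switching hardware in this toy model means passing a different backend function; the kernel itself stays untouched, which is the property the article attributes to codes built on a uniform library.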

One of the next key steps is to adapt PIConGPU to the future US exascale system Frontier. CASUS scientist Jan Stephan is therefore working on configuring alpaka for next-generation hardware: “Specifically, we want to be able to use graphics processors and programmable logic circuits from companies such as AMD, Intel and Xilinx in the future,” says the computer scientist, describing the next immediate goal.

“We have made great progress on this project. We got access to the recently released AMD Instinct MI100 graphics cards and were able to get the full PIConGPU application running on them. In doing so, we observed a 1.4x speed-up compared to the previous card, the MI60. This is promising and we look forward to extending the scientific capabilities of PIConGPU with next-generation AMD GPUs and the upcoming Frontier Exascale system at Oak Ridge National Laboratory,” says project leader Prof. Sunita Chandrasekaran of the University of Delaware. Physicist Dr. Alexander Debus from the Institute of Radiation Physics at HZDR adds: “Frontier will allow us to study the complex plasma dynamics with unprecedented temporal and spatial resolution. This will significantly advance the development of plasma accelerators.”

Data standard for more targeted analysis

In a second Frontier project, Franz Pöschel from CASUS is developing openPMD, an open data standard. It aims to make the enormous amounts of scientific data that plasma simulations will produce accessible as effectively and quickly as possible. PMD stands for Particle Mesh Data, the kind of data that is often generated in physical simulations. With openPMD, the extensive simulation data from plasma physics can be stored efficiently and quickly. In addition, openPMD supports the further use of the data in line with the FAIR principles of open scientific data (findable, accessible, interoperable, reusable). On Frontier, the scientists now want to explore how they can use openPMD to visualize and analyze huge amounts of data in the shortest possible time.

Frontier – next-generation supercomputer

In 2019, the US Department of Energy announced the construction of the Frontier supercomputer at Oak Ridge National Laboratory. Frontier is expected to be operational in 2022, when it will be able to handle one and a half quintillion floating-point operations per second (a quintillion is a one followed by 18 zeros). With these 1.5 exaFLOPS, it would then be the most powerful computer in the world. The Frontier Center for Accelerated Application Readiness (CAAR), which was established specifically for this purpose, has selected eight research projects to accompany the construction of the Frontier high-performance computer and the development of its applications, including the PIConGPU team of the University of Delaware in cooperation with HZDR and CASUS.
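To give a feel for this order of magnitude, the following short calculation spells out the figure of 1.5 exaFLOPS and compares it with a machine running at one petaFLOPS. The workload of 10^21 operations and the petaFLOPS comparison machine are assumptions chosen purely for illustration; they do not describe any real simulation or system mentioned here, and actual applications rarely sustain a machine's peak rate.

```python
EXA = 10**18   # one quintillion: a one followed by 18 zeros
PETA = 10**15  # one quadrillion, used for the comparison machine

# 1.5 exaFLOPS written as an exact integer number of operations per second.
frontier_flops = 15 * EXA // 10

# Hypothetical workload, chosen only for illustration.
workload_ops = 10**21

# Idealized run times, assuming each machine sustains its peak rate.
seconds_on_frontier = workload_ops / frontier_flops
seconds_on_petaflops = workload_ops / PETA

print(f"{frontier_flops:,} operations per second")
print(f"At 1.5 exaFLOPS: {seconds_on_frontier / 60:.1f} minutes")
print(f"At 1 petaFLOPS:  {seconds_on_petaflops / 86400:.1f} days")
```

Under these idealized assumptions, a job that would occupy the petaFLOPS machine for more than eleven days would finish in roughly eleven minutes at 1.5 exaFLOPS.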

Exascale computing is a significant milestone in computer technology. Large-scale and fast simulations that were unimaginable just a few years ago will become possible with Frontier’s enormous computing resources. The first prototypes for future exascale systems are also being built in Europe, such as the JUWELS Booster at the Jülich Research Centre. Various branches of research hope that this will lead to improved applications, for example in medicine, including radiation treatment of cancer, virus research and the fight against Alzheimer’s disease. Such computing capacities are also of utmost importance for the study of climate change and for predicting ecological influences on biodiversity.

The Center for Advanced Systems Understanding (CASUS) was founded in 2019 in Görlitz and conducts data-intensive interdisciplinary systems research in disciplines as diverse as Earth system research, systems biology and materials research. The goal of CASUS is to create digital images of complex systems with unprecedented fidelity to reality, using innovative methods from mathematics, theoretical systems research, simulation, and data and computer science, in order to answer urgent societal questions.

The partners are the Helmholtz-Zentrum Dresden-Rossendorf (HZDR), the Helmholtz Centre for Environmental Research in Leipzig (UFZ), the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden (MPI-CBG), the Technische Universität Dresden (TUD) and the University of Wrocław.

The Center is funded by the Federal Ministry of Education and Research and the Saxon State Ministry of Science, Culture and Tourism.

The exascale-class Frontier supercomputer in the USA is expected to help answer previously unsolved questions in plasma physics. A team of researchers from HZDR and CASUS has already successfully run its first codes. © OLCF