Ongoing CASUS Open Projects
(1) November 2019 to August 2022, extended to August 2023
Exploring a performance portable software stack for PIConGPU to target a next-generation computing system, the FRONTIER Exascale System at ORNL
The Frontier supercomputer is considered the world’s first exascale computer. In mid-2022, Frontier demonstrated a total performance of 1.1 exaflops, that is, 1.1 quintillion floating-point operations per second (flops). This record put Frontier at the top of the Top500 list of the world’s most powerful supercomputers. In this Open Project, the PIConGPU software package is being adapted for this computer so that plasma physics simulations run as efficiently as possible on Frontier. At the request of the University of Delaware, the project has been extended to 2023 at no cost.
This project has three major goals:
- Adapt the large-scale plasma simulation code PIConGPU to run on the DOE Exascale system Frontier
- Develop the Alpaka library for performance-portable many-core programming
- Develop the openPMD I/O ecosystem for Exascale use, especially in-memory workflows
(2) May 2020 to June 2023
Memory layout optimization and efficient interconversion of data structures for heterogeneous architectures
Partners: The European Organization for Nuclear Research CERN (Switzerland), Center for Information Services and High Performance Computing (ZIH) of Dresden University of Technology, Helmholtz-Zentrum Dresden-Rossendorf HZDR
Heterogeneous hardware architectures for complex parallel applications exhibit data throughput limitations that result in lower performance than is theoretically achievable with optimal parallelism. Within this open project, a C++ library called LLAMA (Low Level Abstraction of Memory Access) is being developed to improve throughput by optimizing data access and movement. LLAMA will allow both a usable, minimal-overhead description of memory layouts and an optimization of data layouts for various hardware architectures. The goal is to allow users to describe a data structure suited to their needs. As a first testbed, LLAMA will be used by ROOT, the big data analysis framework used by virtually all high energy physicists worldwide.
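To illustrate what "memory layout" and "interconversion" mean here, the following minimal sketch stores the same records once as an array of structs (AoS) and once as a struct of arrays (SoA) and converts between the two. It is not LLAMA code and does not use LLAMA's C++ API; the record fields and all function names are hypothetical and chosen only for illustration.

```python
from array import array

# Hypothetical particle records with three fields each: x, y, mass.
particles = [(0.0, 1.0, 2.0), (3.0, 4.0, 5.0), (6.0, 7.0, 8.0)]

def to_aos(records):
    """Array of structs: all fields of one record sit next to each other."""
    flat = array("d")
    for record in records:
        flat.extend(record)
    return flat

def to_soa(records):
    """Struct of arrays: all values of one field sit next to each other."""
    return [array("d", column) for column in zip(*records)]

def aos_to_soa(flat, n_fields):
    """Interconvert layouts: field i is every n_fields-th value, starting at i."""
    return [flat[i::n_fields] for i in range(n_fields)]

aos = to_aos(particles)   # x0 y0 m0 x1 y1 m1 x2 y2 m2
soa = to_soa(particles)   # [x0 x1 x2], [y0 y1 y2], [m0 m1 m2]
```

Which layout is faster depends on the access pattern and the hardware: code that reads only one field of many records tends to favor SoA, while code that touches all fields of one record at a time tends to favor AoS.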
(3) November 2020 to October 2023
Computational methods for cell shapes and elastic materials
Elastic materials can form complex shapes that are hard to grasp computationally. In biology, for example, the mechanical behavior of the cell surface itself, as well as additional processes taking place on the surface, result in dynamic shape deformations. To validate hypotheses about mechanisms influencing an elastic material’s shape, effective mathematical and numerical simulation methods are required. This open project aims to contribute them. An overarching theme of the planned work is to advance established calculation methods used for flat (one-, two-, and three-dimensional) spaces to make them work in curved (high-dimensional) spaces.
(4) November 2020 to October 2023
An optimal control approach to maximizing the benefits of limited testing capacity in an emerging pandemic
Partner: University of Maryland (USA)
Insufficient testing leaves public health authorities with little information on how to coordinate efforts to combat an emerging epidemic. Specifically, quick identification and isolation of new infection clusters is of critical importance. While there are recommendations that provide useful qualitative guidance when testing resources are limited, an optimized test allocation strategy is lacking despite its potential to increase testing efficiency. Within this project, a series of mathematical disease models based on ordinary differential equations is constructed to analyze the influence of total testing capacity, information limitations, and other logistical constraints on optimal allocation strategies for flattening the infection curve and reducing mortality. The goal is to identify real-world parameter thresholds and transmission scenarios that determine the viability and optimality of resource allocation strategies. The results are expected to contribute to future testing policy guidelines.
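As a rough illustration of how testing capacity can enter an ordinary differential equation model, the sketch below extends a textbook SIR model with an isolation term capped by a daily capacity. This is not the project's model: the cap `min(capacity, detect_rate * i)`, all parameter values, and the function name `simulate` are assumptions made for this example, and the equations are integrated with a simple explicit Euler scheme.

```python
def simulate(capacity, beta=0.3, gamma=0.1, detect_rate=0.1, days=300, dt=0.1):
    """Return (peak infected fraction, final (s, i, r)) for a given capacity.

    Assumed model, in population fractions:
      ds/dt = -beta*s*i
      di/dt =  beta*s*i - gamma*i - q(i)
      dr/dt =  gamma*i + q(i)
    with q(i) = min(capacity, detect_rate*i): infected people found by
    testing are isolated, but no faster than the testing capacity allows.
    """
    s, i, r = 0.999, 0.001, 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i
        isolated = min(capacity, detect_rate * i)
        ds = -new_infections
        di = new_infections - gamma * i - isolated
        dr = gamma * i + isolated
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        peak = max(peak, i)
    return peak, (s, i, r)

peak_none, _ = simulate(capacity=0.0)       # no testing at all
peak_limited, _ = simulate(capacity=0.005)  # capacity saturates early
peak_ample, final = simulate(capacity=1.0)  # capacity never saturates
```

Comparing the three peaks shows the qualitative effect the project studies: in this toy model, more testing capacity flattens the infection curve, and a capacity that saturates early loses much of that benefit.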
(5) January 2021 to December 2023
A machine-learning inversion framework for materials under extreme conditions
The project’s main goal is to adapt the physics-informed neural networks framework for the inversion of the Kohn-Sham equations. Owing to their prevalence in the physical, biological, and materials sciences, calculations based on these equations are one of the world’s largest overall computational expenses. More specifically, the proposed work has the potential to improve the accuracy of low-cost electronic structure calculations. The results would have a major impact on the simulation of materials under high energy density (HED) conditions, one of the most challenging frontiers of plasma physics and materials science.
(6) August 2022 to July 2025
How joint variation in water quality and quantity affects riverine fish biodiversity in a changing world
The project aims to develop a more accurate system for predicting biodiversity loss in rivers due to global change and to gain an understanding of which aspects of water quality (temperature, oxygen, phosphorus, etc.) have the greatest impact on predictive accuracy. While predictive models for impacts due to changing water quantity in rivers already exist, they assume uniform water quality. Expanding the models to include water quality will put future measures for species protection and for biodiversity conservation and restoration on an even stronger scientific basis.
(7) December 2022 to November 2025
HyperUAV-1: Scaling and spatial extrapolation of ultra-high resolution hyperspectral data
Many research fields benefit heavily from mapping applications. They help assess vegetation diversity, ecosystem stress, crop health, geothermal heat flow, soil chemistry, or basement geology. Alongside ground measurements and airborne or satellite data, uncrewed aerial vehicles (UAVs) have proven to bridge a critical scale gap. However, methods for integrating high-resolution UAV surveys with datasets at other scales, and for extrapolating this information across regions covered only by airborne or satellite data, are not available. This open project aims to establish such methods for hyperspectral data. Among other factors, variations between sensor types and acquisition platforms make quantitative comparisons challenging. The project investigates spectral mixing effects with the aim of quantifying the influence of scale on spectral measurements and associated classifications. Finally, the resolution of low-resolution satellite or airborne data will be enhanced using generative deep learning methods developed within this open project.
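The scale effect mentioned above can be illustrated with the standard linear mixing model, in which the spectrum of a coarse pixel is the area-weighted average of the spectra of the materials it covers. The sketch below is only an assumption about the simplest case, not the project's method; the two-band reflectance values and the function name `mix_spectra` are invented for illustration.

```python
def mix_spectra(endmembers, fractions):
    """Return the linearly mixed spectrum of one coarse pixel.

    endmembers: material name -> reflectance per band (hypothetical values)
    fractions:  material name -> area fraction inside the pixel (sums to 1)
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("area fractions must sum to 1")
    n_bands = len(next(iter(endmembers.values())))
    mixed = [0.0] * n_bands
    for name, fraction in fractions.items():
        for band, reflectance in enumerate(endmembers[name]):
            mixed[band] += fraction * reflectance
    return mixed

# Two hypothetical materials observed in two spectral bands.
endmembers = {"vegetation": [0.05, 0.50], "soil": [0.20, 0.30]}

# A satellite pixel covering 25 % vegetation and 75 % soil.
coarse_pixel = mix_spectra(endmembers, {"vegetation": 0.25, "soil": 0.75})
```

A UAV pixel small enough to cover a single material would return that material's spectrum unchanged, whereas the coarse pixel matches neither material, which is one reason classifications can change with scale.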
(8) December 2022 to November 2025
HyperUAV-2: Scaling hyperspectral data to meaningfully quantify essential ecosystem variables
Successful ecosystem management means making the right decisions as early as possible. To do so, managers could benefit from data that allow direct quantification of disturbances resulting from pests, drought stress, floods, fires, or invasive species. This open project is about advancing the early detection of such events and the correct judgement of their impact on ecosystem function across spatial and temporal scales. Hyperspectral data gained from unmanned aerial vehicles (UAVs) are considered key for a successful extrapolation of continuous ground or tower measurements over entire ecosystems. Within this project, the machine learning approaches developed and outlined in the open project “HyperUAV-1” (see above) will be used to develop and validate data-driven models for essential ecosystem variables that integrate (1) continuous ground or tower measurements, (2) high-resolution but spatially limited UAV surveys, and (3) lower-resolution but large-extent airborne or satellite data.
(9) January 2023 to December 2025
AutoTarget – Autonomous multi-UAV (unmanned aerial vehicle) for the characterization of remote and isolated targets
Drones are now used in many scientific disciplines for data acquisition. Often, there are trade-offs between duration of use, data quality, overflown area, and other constraints. The Open Project AutoTarget aims to combine the advantages of small systems (long flight duration, large overflown area) with those of large systems (more and better data). The solution could be a system of drones of different designs traveling together. Lightweight overview drones and heavier drones, fully loaded with sensors, would need to communicate perfectly to coordinate flight paths and determine exactly where to collect what data. The Open Project will develop the appropriate software and adapt the unmanned aerial systems in coordination with the hardware specifications for the use case of remote area ground survey.
(10) January 2023 to December 2025
AI-based decision support for treatment intervention and treatment clearance within an online adaptive proton therapy workflow
In the field of cancer radiotherapy, there has been recent progress in the online adaptation of treatments, so far in photon-based radiotherapy. If a comparable success could be achieved in proton therapy, this would significantly improve the therapy of many affected patients. The project goal is to advance the development of a fully automated, artificial intelligence (AI)-based clinical decision support system capable of interfacing with the treatment planning system, the treatment control system, and the oncology information system. Clinical implementation of such online-adapted proton therapy requires efficient and secure solutions for the various tasks in the feedback loop. Thus, there should be both direct and fast feedback to the clinical staff and a possibility for efficient and convenient retrospective review of the automated decisions.
(11) March 2023 to February 2026
Using Natural Language Processing to learn the language rules of genomes
Partner: Dresden University of Technology
Machine learning (ML) methods are successfully used within genomics for numerous classification tasks. While ML has been shown to be effective using nucleic acid sequences or statistical representations as input, natural language processing (NLP) techniques promise to yield new and different insights in genomics. Recent progress in NLP techniques makes it possible to address language rules, such as the relative position of specific motifs, within nucleic acid sequences as well. Within this project, task-agnostic language models of the human genome and of one specific virome, the SARS-CoV-2 genome, will be developed. Afterwards, both models’ grammar and syntax will be analyzed to find out whether relevant (known) patterns of genomic elements are identified. This will result in fine-tuned models that will finally be used to address open biological questions, e.g. from the research fields of human genome stability or mutagenic event prediction in viruses.
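Before a language model can be trained on a genome, the nucleic acid sequence has to be split into "words". One common choice, assumed here purely for illustration and not confirmed as this project's approach, is to cut the sequence into overlapping k-mers and map each distinct k-mer to an integer ID. The function names and the example sequence are hypothetical.

```python
def kmer_tokens(sequence, k=3, stride=1):
    """Split a nucleic acid sequence into k-mers, moving `stride` bases per step."""
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]

def build_vocabulary(tokens):
    """Assign a stable integer ID to every distinct token (alphabetical order)."""
    return {token: idx for idx, token in enumerate(sorted(set(tokens)))}

def encode(tokens, vocabulary):
    """Turn tokens into the integer IDs a language model would consume."""
    return [vocabulary[token] for token in tokens]

tokens = kmer_tokens("atgcgt", k=3)      # overlapping 3-mers
vocabulary = build_vocabulary(tokens)
token_ids = encode(tokens, vocabulary)
```

With `stride=1` neighboring tokens overlap, which keeps positional detail about motifs; setting `stride=k` gives non-overlapping tokens and shorter inputs at the cost of that detail.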