Summer School on HPC Driving AI

The rapid progress in AI increasingly requires computing resources far beyond conventional workstations. High Performance Computing (HPC) forms the backbone of modern data- and compute-intensive applications. Combining HPC expertise with AI methods opens entirely new possibilities in science and technology, particularly in advanced data analytics, physical simulations, stochastic modeling and uncertainty quantification, and data-driven applications.

The Summer School brings together leading HPC expertise with innovative approaches in AI, data analytics, physical simulations, and stochastic modeling.

A consortium comprising the Technical University of Munich, ETH Zürich Campus Heilbronn, the Jülich Supercomputing Centre, HLRS Stuttgart, the Swiss National Supercomputing Centre (CSCS), and the Leibniz Supercomputing Centre is organizing and hosting the first joint interdisciplinary summer school on advanced high-performance computing (HPC) techniques for machine learning, data-driven models, and physics-based simulations.

Over five days of intensive sessions, workshops, and networking opportunities, participants will learn how cutting-edge HPC technologies enable the next generation of data-driven applications—from scalable AI models to high-precision simulations of complex systems. They will gain insights into ongoing research projects, apply methods in practical exercises, and network with leading researchers and the regional innovation ecosystem in Heilbronn—a center for AI, digitalization, and high-performance computing.

Jens Domke (RIKEN, Japan)
Valentin Churavy (University of Augsburg, Germany)
Thorsten Kurth (Nvidia, Switzerland)
Fernanda Foertter (University of Alabama, USA)
Hatem Ltaief (KAUST, Saudi Arabia)
Monica Dessole (CERN, Switzerland)
Marius Kurz (AMD, Germany)
Johannes Gebert (HLRS)
Mathis Bode (Jülich Supercomputing Centre)

Kick-off social event: Get to know trainers, speakers, and other participants in a relaxed atmosphere.
GPU programming & exascale hands-on: Intensive course with direct access to the JUPITER supercomputer.
Smart breakfast sessions: Short, interactive presentations on research topics and productivity in the HPC environment.
Conference Day: International guest speakers present cutting-edge topics from HPC, AI, and their convergence.
Public Evening Lecture: Exciting insights into socially relevant applications of HPC and AI.
Focused Specialist Modules: Containerization, AI on a large scale, future technologies, physics simulations, and HPC visualization.

Tuesday (June 16)

5:00 – 8:00 p.m.

Welcome and get-together with standing dinner

Wednesday (June 17)


7:00 – 7:30 a.m.	Morning Workout

8:30 – 9:15 a.m.	Morning Keynote	Joachim Buhmann ETH Zürich

9:15 a.m. – 12:15 p.m.	From Scratch to Silicon: Tracking a Transformer through the JAX & XLA Compiler Pipeline. While modern machine learning frameworks decouple model design from hardware complexity, achieving peak performance on modern accelerators requires understanding the compiler pipeline that translates symbolic graphs into machine instructions. Led by a Google GPU compiler engineer, this hands-on, three-hour workshop bridges the gap between high-level model architectures and bare-metal GPU execution. Using a character-level GPT transformer as a concrete vehicle, participants will implement and trace a model’s journey from Python code down to physical silicon. Utilizing pre-trained weights to bypass training latency, we will leverage JAX and Google Colab to systematically analyze each stage of the compilation pipeline. Key Topics Covered: JAX Tracing: Lowering Python code to JAX’s typed, intermediate representation (Jaxpr). StableHLO & HLO IR: Analyzing how high-level operations map to portable, hardware-independent tensor algebra. XLA Optimization Passes: Deconstructing algebraic simplifications and critical loop fusion passes that accelerate large-scale workloads such as Gemini, AlphaFold, and Waymo. GPU Codegen: Examining the final compiled low-level GPU assembly (PTX/LLVM IR) to see how the compiler maps memory and threads onto silicon. By the end of this session, participants will gain a systems-level perspective on deep learning compilers, equipping them to design models that naturally align with hardware-specific optimizations.	Dirk Hornung Dirk Hornung is a GPU Compiler Engineer at Google, where he works on the XLA (Accelerated Linear Algebra) compiler ecosystem. Before diving into silicon-level optimization, Dirk earned his PhD in Physics from the Universitat Autònoma de Barcelona, focusing on precision determinations of Standard Model parameters.
Today, he works at the cutting edge of hardware acceleration, designing compiler code-generation and performance optimizations that power Google's largest AI applications, including Gemini, AlphaFold, and Waymo's autonomous driving software. Dirk is passionate about bridging the gap between high-level AI model architectures and highly efficient, bare-metal GPU execution.

12:15 – 1:00 p.m.	Lunch

1:00 – 4:00 p.m.	Microarchitectures and Future Computing Technologies 1:00 – 1:30: Introduction to Modern HPC Architectures 1:30 – 2:00: Panel on Future HPC Architectures 2:00 – 4:00: Hands-on Session on OpenMP	Michael Klemm Dr. Michael Klemm is a Principal Member of Technical Staff in the Compilers, Languages, Runtimes & Tools team of the Machine Learning & Software Engineering group at AMD. He is the lead architect of the AMD Fortran Compiler for AMD GPUs. Michael is also the Chief Executive Officer of the OpenMP Architecture Review Board and a lecturer at the Chair of Computer Architecture and Parallel Systems at the Technical University of Munich. He holds an M.Sc. in Computer Science and a Doctor of Engineering degree (Dr.-Ing.) in Computer Science from the Friedrich-Alexander-University Erlangen-Nuremberg, Germany. Michael's research focus is on compilers and runtime optimizations for distributed systems. His areas of interest include compiler construction, design of programming languages, parallel programming, and performance analysis and tuning. Amir Raoofy Amir is a researcher working in the Future Computing and Energy Efficiency group at the Leibniz Supercomputing Centre, and is affiliated with the Technical University of Munich (TUM) as a postdoctoral researcher. He earned his PhD (Dr. rer. nat.) in 2024 from the Chair of Computer Architecture and Parallel Systems at TUM, where his research focused on leveraging high-performance computing (HPC) for large-scale data analysis. His research interests span system software and programming, high-performance and smart networking in HPC, architectural evaluations and co-design, as well as programming models and usage paradigms.

4:00 – 4:30 p.m.	Coffee Break

4:30 – 7:00 p.m.	Distributed GPU Computing Tutorial	Mathis Bode Jülich Supercomputing Centre Dr. Mathis Bode is a leading researcher at the Jülich Supercomputing Centre (JSC) and RWTH Aachen University, specializing in the intersection of extreme-scale artificial intelligence, high-performance computing (HPC), and computational fluid dynamics. He heads major initiatives at the frontier of exascale computing, serving as the executive coordinator of the JUPITER AI Factory (JAIF) and leading the JUPITER Research and Early Access Program (JUREAP). Dr. Bode’s research focuses on developing GPU-accelerated solvers and deep learning frameworks, such as Physics-Informed Enhanced Super-Resolution Generative Adversarial Networks (PIESRGANs), to simulate highly complex multi-phase flows, turbulent combustion, and sustainable energy systems like hydrogen flames.

Thursday (June 18)


7:00 – 7:30 a.m.	Morning Workout

8:30 – 9:15 a.m.	HPC in the Era of AI Datacenters - Are we still doing the right thing?	Prof. Dr. Hartwig Anzt Professor of High Performance Computing in Artificial Intelligence, TUM Campus Heilbronn

9:15 – 10:00 a.m.	Inside an LLM: A hands-on Performance Deep Dive with TinyLlama This talk introduces the key computational building blocks of Large Language Models and transformer architectures, focusing on their performance behavior and common bottlenecks. Using a small reference model, we walk through a practical approach to profiling AI and HPC workloads on modern GPUs and demonstrate how to identify and address typical performance limitations through optimizations that also underpin current large-scale transformer architectures.	Marius Kurz Marius Kurz is an MTS Software Applications Engineer at AMD with a background in CFD and converged HPC/AI. He supports and optimizes large-scale HPC and AI workloads on AMD GPUs, focusing on performance profiling, optimization, and scalability.

10:00 – 10:30 a.m.	Coffee Break

10:30 – 11:15 a.m.	ROOTing performance-portable heterogeneous computing with SYCL Modern scientific data analyses increasingly rely on heterogeneous computing facilities (CPUs, GPUs, and emerging accelerators) to meet the ever-growing performance demand. This is especially the case for high-energy physics analyses, as the upcoming High-Luminosity Large Hadron Collider (HL-LHC) will bring unprecedented data volumes to process. To keep pace with this scale, long-lived software ecosystems like ROOT must evolve without fragmenting the codebase or burdening users with device-specific complexity. This talk presents our experience in integrating SYCL into the ROOT library in the broader context of the SYCLOPS European project to enable performance-portable heterogeneous execution for core workflows, including large-scale data preprocessing and hybrid physics-AI pipelines. We discuss the design principles that guided this effort, the practical challenges encountered, and how they were addressed, using concrete examples from the ROOT components. By sharing lessons learned and performance results, this talk aims to demonstrate how SYCL can serve as a viable and sustainable path for modernizing large scientific software frameworks to fully exploit current and future computing platforms.	Monica Dessole Monica Dessole is a visiting researcher at the NATO Center for Maritime Research and Experimentation (CMRE) specializing in high-performance and heterogeneous computing for numerical modelling. Previously, she was a research fellow at CERN, contributing to the ROOT framework with a focus on performance-portable C++ abstractions and SYCL-based acceleration for high-energy physics applications. She holds a PhD in Computational Mathematics from the University of Padua, where she investigated linear algebra algorithms for parallel architectures.

11:15 a.m. – 12:00 p.m.	Morning Keynote: Right Precision, Right Place: Rethinking HPC Applications with Adaptive Mixed Precision The future of large-scale simulations is increasingly tied to hardware features originally designed for AI workloads—especially low-precision arithmetic. Modern GPUs embody this shift, delivering substantial speedups through reduced-precision computations that lower execution time, shrink memory footprints, and cut energy consumption. Building on these capabilities, we design fast mixed-precision linear algebra algorithms that adaptively choose the right precision at the right moment. Beyond computation, we extend this adaptivity to data storage, enabling mixed-precision representations that dynamically adjust to the needs of memory-bound applications. This approach significantly reduces data movement—now a dominant performance bottleneck—while preserving high accuracy only where it truly matters. Our dynamic precision-conversion strategy maintains application-level numerical reliability while improving overall efficiency. This talk will demonstrate how these techniques reshape computational performance for geospatial statisticians and geophysicists, with far-reaching benefits for environmental computational statistics, seismic imaging, and beyond.	Hatem Ltaief King Abdullah University of Science and Technology (KAUST), Saudi Arabia Hatem Ltaief is a Principal Research Scientist in the Computer, Electrical and Mathematical Sciences and Engineering Division at KAUST. His research focuses on mixed-precision algorithms, low-rank matrix computations, parallel programming models, and performance optimizations for high-performance computing (HPC) systems equipped with hardware accelerators. Hatem has co-authored all four of KAUST's Gordon Bell finalist papers since 2022. In November 2024, he received the prestigious ACM Gordon Bell Prize (shared) in climate modeling for his contributions to developing an exascale climate emulator.
He currently serves as co-Editor-in-Chief of the ACM Transactions on Mathematical Software and as an Associate Editor-in-Chief of the Elsevier Parallel Computing Journal.
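For readers new to the mixed-precision idea in the keynote abstract above, the following is a minimal sketch of classical iterative refinement. It is not the speaker's adaptive algorithm. The expensive linear solve runs in emulated single precision, while residuals and solution updates stay in double precision. All names (`to_f32`, `solve_low_precision`, `refine`) and the 3x3 example system are illustrative assumptions. Single precision is emulated with the standard `struct` module so that the sketch needs no external libraries.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_low_precision(a, b):
    """Gaussian elimination with partial pivoting; every result is rounded to float32."""
    n = len(b)
    # Augmented matrix [A | b], stored in emulated single precision.
    m = [[to_f32(v) for v in row] + [to_f32(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = to_f32(m[r][col] / m[col][col])
            for c in range(col, n + 1):
                m[r][c] = to_f32(m[r][c] - to_f32(factor * m[col][c]))
    # Back substitution, also in emulated single precision.
    x = [0.0] * n
    for r in reversed(range(n)):
        s = m[r][n]
        for c in range(r + 1, n):
            s = to_f32(s - to_f32(m[r][c] * x[c]))
        x[r] = to_f32(s / m[r][r])
    return x


def residual(a, b, x):
    """r = b - A x, evaluated in full double precision."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(a, b)]


def refine(a, b, iterations=5):
    """Low-precision solves corrected by double-precision residuals."""
    x = solve_low_precision(a, b)
    history = [max(abs(r) for r in residual(a, b, x))]
    for _ in range(iterations):
        r = residual(a, b, x)              # high precision where it matters
        d = solve_low_precision(a, r)      # cheap correction solve
        x = [xi + di for xi, di in zip(x, d)]
        history.append(max(abs(v) for v in residual(a, b, x)))
    return x, history


if __name__ == "__main__":
    # Hypothetical, well-conditioned example system.
    A = [[4.1, 1.3, 2.7], [1.3, 3.9, 0.2], [2.7, 0.2, 5.3]]
    b = [1.0, 2.0, 3.0]
    solution, residual_history = refine(A, b)
    print(solution)
    print(residual_history)
```

The residual history is expected to shrink quickly for a well-conditioned system like the one above. For ill-conditioned systems, plain refinement of this kind can stall or diverge.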

12:00 – 1:00 p.m.	Lunch

1:00 – 1:45 p.m.	The Infinite Loop: Why HPC is the Bedrock and Future of Artificial Intelligence Modern AI did not emerge in a vacuum; it was forged in the fires of High-Performance Computing (HPC). From the foundational math libraries that underpin every neural network to the massive parallelization strategies originally developed for physical simulations, the DNA of HPC is woven into every successful model today. This talk explores how HPC’s expertise in scalability, interconnects, and memory optimization is the only way forward as AI reaches trillion-parameter scales. However, the relationship is now a two-way street: we will conclude with the "twist" of how AI is revitalizing its progenitor by accelerating traditional simulations through neural surrogates and automating the management of the world’s most complex infrastructure.	Fernanda Foertter Executive Director, High Performance Computing and Data Center Division for Research The University of Alabama

1:45 – 2:30 p.m.	Differentiable Simulations Physics-based simulation has long been a cornerstone in domains such as engineering, energy systems, fluid dynamics, and materials design, where first-principles models provide interpretability, reliability, and strong inductive structure. At the same time, recent years have seen a growing trend toward AI-based surrogate models that approximate complex physical processes with greatly reduced computational cost, enabling faster prediction, control, and optimization. Differentiable simulation brings these two paradigms together: it combines the structure and domain knowledge of traditional simulators with gradient-based learning and optimization techniques from modern machine learning. By making simulations differentiable end-to-end, one can calibrate model parameters, solve inverse problems, learn control strategies, and embed physical models directly into training pipelines. This lecture will introduce the fundamentals of differentiable simulations, discuss practical implementation aspects in PyTorch, and address important conceptual challenges, including the use of surrogate functions for equations or operators that are not naturally differentiable. Throughout the lecture, we will use optimization problems in concentrating solar power plants as a guiding example, illustrating how differentiable simulation can support the design, operation, and control of complex energy systems.	Markus Götz Karlsruhe Institute of Technology
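As a rough illustration of what "differentiable end-to-end" means in the abstract above, the sketch below pushes a hand-written dual number through a tiny explicit Euler simulation and uses the resulting derivative to calibrate a model parameter. It is a toy under stated assumptions and not material from the lecture. The lecture uses PyTorch, whereas this sketch substitutes a minimal forward-mode `Dual` class so that it runs with the standard library only. The Newton cooling model, the helper names (`Dual`, `simulate`, `calibrate`), and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one chosen parameter."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u v)' = u' v + u v'
        return Dual(self.val * other.val, self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def simulate(k, t0=90.0, t_env=20.0, dt=0.1, steps=50):
    """Explicit Euler for Newton cooling dT/dt = -k (T - T_env); works for float or Dual k."""
    temp = t0
    for _ in range(steps):
        temp = temp - dt * k * (temp - t_env)
    return temp


def calibrate(target, k=0.1, iterations=20):
    """Fit k so that the simulated final temperature matches `target`.

    Uses a Newton update on the mismatch, which needs exactly the derivative
    d(T_final)/dk that the dual number carries through the simulation.
    """
    for _ in range(iterations):
        out = simulate(Dual(k, 1.0))   # seed derivative dk/dk = 1
        k -= (out.val - target) / out.dot
    return k


if __name__ == "__main__":
    observed = simulate(0.3)           # synthetic "measurement" with a known k
    print(calibrate(observed))
```

The same pattern carries over to reverse-mode frameworks: the simulator is written once, and the derivative comes from the framework rather than from a hand-derived adjoint.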

2:30 – 3:15 p.m.	Automatic Differentiation for Scientific Computing Scientific computing requires both an HPC-first approach and the integration of modern techniques like machine learning. Yet we want to integrate ML into scientific computing rather than replace it wholesale. Training ML approaches embedded in models requires us to obtain gradients from those models via automatic differentiation, and thus to differentiate through HPC paradigms like MPI and CUDA.	Valentin Churavy University of Augsburg

3:15 – 3:30 p.m.	Coffee Break

3:30 – 6:00 p.m.	Distributed GPU Computing Tutorial	Mathis Bode Jülich Supercomputing Centre

8:00 – 9:00 p.m.	Public Talk: From Simulation to Learning: AI and HPC for Weather and Climate Prediction Recent advances in machine learning are fundamentally reshaping weather and climate prediction. Data-driven models such as FourCastNet, GenCast, AIFS, and other emerging generative approaches now rival or complement traditional numerical weather prediction (NWP) by delivering faster forecasts while capturing complex atmospheric dynamics. At the same time, initiatives such as WeatherGenerator and NVIDIA Earth-2 highlight a broader vision: building AI-powered digital twins of the Earth system that integrate weather, climate, and environmental processes at unprecedented scale. This talk explores the intersection of these developments with high-performance computing (HPC). AI workloads are reshaping both software and hardware stacks on modern supercomputers: accelerator-centric architectures, mixed-precision training, distributed deep learning frameworks, and scalable I/O pipelines for massive geophysical datasets are becoming central design considerations. Conversely, HPC expertise remains critical for scaling training and inference, integrating hybrid AI–physics workflows, and enabling robust deployment. We will discuss implementation aspects and practical challenges using FourCastNet3 and transformer-based models as concrete examples. In particular, we examine architectural trends, scaling behavior, and the impact of modern GPU features on scientific AI workloads. Finally, we outline open challenges and future directions, including coupling AI models with observational data, extending methods toward climate timescales, and improving physical consistency and reliability. The convergence of AI and HPC is not only accelerating weather prediction but also redefining how Earth system science is conducted.	Thorsten Kurth Thorsten works at NVIDIA, where he focuses on optimizing scientific applications for GPU-based supercomputers. 
His work centers on developing high-performance deep learning solutions for HPC systems, including end-to-end optimizations such as input pipelines, I/O tuning, and distributed training leveraging data parallelism alongside model parallelism across multiple dimensions. In 2018, he was awarded the Gordon Bell Prize for leading the first deep learning application to surpass 1 ExaOp peak performance on the OLCF Summit system. In 2020, he received the Gordon Bell Special Prize for HPC-based COVID-19 research, recognizing his work on generating large ensembles of spike trimer conformations using the AI-driven molecular dynamics workflow DeepDriveMD. More recently, his work has focused on generative weather forecasting, contributing to the development of the FourCastNet model.

Friday (June 19)


7:00 – 7:30 a.m.	Morning Workout

8:30 – 9:15 a.m.	RIKEN, R-CCS and Fugaku(NEXT): The World’s Top Supercomputers for Tackling SDGs Challenges with AI This talk explores RIKEN's roadmap for using Fugaku and the future FugakuNEXT to tackle global SDGs and Japan's pressing scientific challenges in disaster prediction, healthcare, and carbon neutrality. We introduce RIKEN's Center for Computational Science (R-CCS), Fugaku's unique Arm-SVE architecture, and our plans for large-scale AI-HPC convergence. We cover topics such as porting massive AI models to Fugaku and our path towards a full-spectrum AI platform, enabling efficient model serving, real-time inference, and surrogate modeling at scale. Finally, we unveil FugakuNEXT's architecture, co-designed to serve the HPC and AI inference demands of Japan's science community in 4 years.	Jens Domke Jens Domke is the Team Principal of the Supercomputing Performance Research Team at the RIKEN Center for Computational Science (R-CCS), Japan. He received his doctoral degree from the Technische Universität Dresden, Germany, in 2017 for his work on HPC routing algorithms and interconnects. Jens started his career in HPC in 2008, after he and a team of five students from TU Dresden and Indiana University won the Student Cluster Competition at SC08. Since then, he has published dozens of peer-reviewed journal and conference articles. Jens contributed the DFSSSP and Nue routing algorithms to the subnet manager of InfiniBand, and built the first large-scale HyperX prototype at the Tokyo Institute of Technology. His research interests include system co-design, performance evaluation, extrapolation, and modelling, interconnect networks, and optimization of parallel applications and architectures.

9:15 a.m. – 12:15 p.m.	HPC-Driven Visualization (Room D1.08) Visualization is a key enabler for the effective analysis and interpretation of large-scale simulation data produced by high-performance computing (HPC) systems. In particular, immersive and interactive visualization approaches, including virtual reality (VR), offer new opportunities to understand complex, high-dimensional datasets that are increasingly common in computational science and engineering. This three-hour summer school module introduces participants to state-of-the-art visualization workflows for HPC environments, with a focus on immersive visualization and scalable, modular frameworks. The course begins with an overview of the role of virtual reality in HPC visualization, covering fundamental concepts, typical use cases, and performance, data management, and user interaction challenges. Participants are then guided through the setup of containerized visualization environments using Docker, enabling reproducible and portable deployment of complex visualization software stacks. The core of the module is a hands-on session dedicated to the Covise and Vistle visualization frameworks, developed by HLRS, which are widely used for interactive, distributed, and immersive visualization of large simulation datasets. Through practical exercises, participants gain experience in configuring visualization pipelines, connecting to HPC data sources, and exploring data interactively in both desktop and immersive settings. The module targets graduate students and early-career researchers and assumes a readily available Docker environment. By the end of the session, participants will be equipped with practical knowledge and tools to integrate advanced visualization techniques into HPC-based research workflows. 
9:00 – 9:30: Introduction to Virtual Reality for HPC 9:30 – 10:00: Setup of the Docker Containers 10:00 – 10:10: Coffee Break 10:10 – 12:00: Hands-on Session for Covise/Vistle Communication at Scale (Room D2.01)	Susanne Malheiros, Martin Aumüller, Johannes Gebert HLRS Hussein Harake, Andreas Joksch CSCS

12:15 – 1:00 p.m.	Lunch

1:00 – 4:00 p.m.	Mathis Bode: JUPITER: Driving Extreme-Scale AI and Simulation in Europe JUPITER is the first European exascale supercomputer and is operated at the Jülich Supercomputing Centre (JSC) at Forschungszentrum Jülich. With approximately 24,000 NVIDIA GH200 superchips, JUPITER provides a unique environment for extreme-scale AI and simulation. This presentation provides a detailed overview of JUPITER, describes its operating model, and summarizes initial scientific success stories. It also describes how JUPITER is being expanded into an AI-optimized computing platform as part of the JUPITER AI Factory (JAIF). Essential to this, in addition to numerous AI services, is the integrated inference module JARVIS, which will enable the secure, sovereign, and scalable deployment of AI applications. Jacob Finkenrath: PLEGMA with QUDA: A look into the structure of the nucleons Large-scale computing can be used as a microscope to get a closer look into the structure of hadrons. Supercomputers allow us to simulate the strong interaction, the nuclear force that glues together quarks and gluons within hadrons. Properties of protons and neutrons, which are the constituents of all atomic nuclei, can be simulated via lattice QCD, a discretized quantum field theory. Calculations become more precise as the discretization, that is, the lattice spacing, becomes finer. For our newest measurements at our finest lattice spacing so far, we utilized the EAP of JUPITER. In the talk, I will discuss our computational challenges and how JUPITER will help us take the next steps to deepen our understanding of nuclear structure.	Mathis Bode Jülich Supercomputing Centre Dr. Mathis Bode is a leading researcher at the Jülich Supercomputing Centre (JSC) and RWTH Aachen University, specializing in the intersection of extreme-scale artificial intelligence, high-performance computing (HPC), and computational fluid dynamics.
He heads major initiatives at the frontier of exascale computing, serving as the executive coordinator of the JUPITER AI Factory (JAIF) and leading the JUPITER Research and Early Access Program (JUREAP). Dr. Bode’s research focuses on developing GPU-accelerated solvers and deep learning frameworks, such as Physics-Informed Enhanced Super-Resolution Generative Adversarial Networks (PIESRGANs), to simulate highly complex multi-phase flows, turbulent combustion, and sustainable energy systems like hydrogen flames. Jacob Finkenrath University of Wuppertal Dr. Jacob Finkenrath is an ERC Research Group Leader in the Theoretical Particle Physics department at the Bergische Universität Wuppertal. After earning his PhD from Wuppertal in 2015, he spent seven years at the Cyprus Institute advancing high-performance computing strategies, followed by a prestigious tenure as a Staff Scientist at CERN. In 2025, he returned to Wuppertal to head the LEEX project ("Lattice QCD simulations at the dawn of European Exascale Computing"), a pioneering initiative funded by an ERC Consolidator Grant. His research sits at the intersection of theoretical physics and computational science, focusing on developing cutting-edge algorithms and utilizing exascale supercomputers to simulate Lattice Quantum Chromodynamics (QCD).

1:00 – 4:00 p.m.	GPU-Centric Supercomputing on Jupiter Hendrik Nicolai: NekCRF: Capturing Turbulent Hydrogen Combustion in Gas Turbine Combustors at Exascale Hydrogen and other renewable fuels exhibit fundamentally different combustion behavior, revealing critical limitations in existing models that are not suitable for high-fidelity simulations of reactive flows. Addressing these challenges requires the development of new, physically grounded combustion models. At the same time, evolving HPC architectures demand a new generation of CFD software optimized for massively parallel, GPU-centric platforms in order to fully exploit emerging computational capabilities. Leveraging the novel GPU-accelerated spectral element solver NekCRF, we perform large-scale direct numerical simulations (DNS) with finite-rate chemistry on modern heterogeneous HPC architectures. The solver is specifically designed for exascale systems, with all components including flow solver, transport operators, and in particular the thermo-chemistry integration fully optimized for efficient execution on GPUs. NekCRF employs high-order spectral element discretizations combined with GPU-native algorithms to maximize arithmetic intensity and minimize memory movement, ensuring excellent scalability across large numbers of accelerators. Special attention is given to the treatment of stiff chemical source terms, where tailored data layouts, kernel fusion, and optimized integration strategies enable efficient utilization of GPU resources without compromising numerical accuracy. The resulting simulations serve as high-fidelity benchmark datasets for the development and validation of next-generation numerical methods and models. More broadly, this work demonstrates how tightly integrated algorithmic design and hardware-aware implementation can significantly advance the state-of-the-art in large-scale scientific computing, enabling simulations at resolutions and fidelities that were previously out of reach. 
Jan Frederik Engels: ICON: Efficiently Simulating the Full Earth System at 1km Simulating the full Earth system demands an explicit representation of the tightly coupled flows of energy, water, and carbon across all dynamically relevant spatial and temporal scales. In particular, capturing the energy and water cycles requires resolving convective instability which implies kilometer-scale resolution, with grid spacings on the order of 1 km to represent these processes explicitly rather than parametrically. Meeting this challenge involves orchestrating multiple strongly coupled model components, each with distinct computational patterns and performance requirements. At the same time, modern high-performance computing (HPC) systems offer increasingly heterogeneous architectures, combining diverse compute units with varying memory-access characteristics. This talk highlights how ICON can be efficiently mapped onto next-generation architectures such as JUPITER, Europe’s first exascale system, and presents first results that demonstrate both the feasibility of global kilometer-scale simulations and an unprecedented level of computational throughput.	Hendrik Nicolai Technical University Darmstadt Prof. Dr.-Ing. Hendrik Nicolai is a prominent researcher and junior professor specializing in the Simulation of reactive Thermo-Fluid Systems (STFS) within the Department of Mechanical Engineering at the Technische Universität Darmstadt. His research sits at the cutting edge of computational fluid dynamics (CFD) and high-performance computing (HPC), where he focuses on the modeling and numerical investigation of laminar and turbulent reacting flows. Dr. 
Nicolai’s work is highly instrumental in the green energy transition; his recent projects include evaluating zero-carbon fuels like hydrogen-ammonia mixtures, exploring iron dust/air combustion via the Clean Circles research initiative, and optimizing exascale GPU-accelerated solvers to predict soot and pollutant emissions in next-generation aircraft engines. Jan Frederik Engels German Climate Computing Center (DKRZ) Dr. Jan Frederik Engels is a distinguished computational physicist with over 15 years of experience in high-performance computing (HPC) and climate modeling. He currently serves as a senior researcher and developer at the German Climate Computing Center (DKRZ) in Hamburg, where he has spent nearly a decade optimizing and scaling the ICOsahedral Non-hydrostatic (ICON) unified weather and climate model. Dr. Engels’s pioneering work bridges astrophysics—having earned his doctorate at the University of Göttingen researching numerical simulations of cosmic structure and dark matter—with exascale atmospheric science. He was recently recognized as a co-recipient of the prestigious ACM Gordon Bell Prize for Climate Modelling for his team's landmark achievement in executing the first-ever global, full Earth system simulation at an unprecedented 1-km resolution.

4:00 – 4:30 p.m.	Coffee Break

4:30 – 7:00 p.m.	Distributed GPU Computing Tutorial	Mathis Bode Jülich Supercomputing Centre

Saturday (June 20)


7:00 – 7:30 a.m.	Morning Workout

8:30 – 9:15 a.m.	Morning Keynote by Maria Grazia Giuffreda (CSCS)

9:15 a.m. – 12:15 p.m.	Distributed GPU Computing Tutorial

12:15 – 1:00 p.m.	Lunch Bags and Farewell

Advanced HPC & AI knowledge: Learn advanced topics such as performance engineering, containerization, scalable AI, and HPC-driven visualization.
Networking & community: Make valuable contacts with like-minded people, experts, and leading HPC centers.
Learn from top experts: International speakers from TUM, ETH Zürich Campus Heilbronn, JSC, HLRS, CSCS and LRZ.
Practice-oriented project work: Contribute your own research questions and work on them directly with subject matter experts.

Advanced master's students, doctoral candidates, young scientists (postdocs).
Students from all over the world who want to deepen their expertise in the field of high-performance computing.
Participants who already have experience with HPC and AI.

Date: June 16 to June 20, 2026

Location: TUM Campus Heilbronn, Germany

Application deadline: May 29, 2026

Language: The program will be conducted entirely in English.

Prerequisites: We aim for participants to share a foundational understanding of Machine Learning and statistics, along with experience using HPC systems. Familiarity with the UNIX/Linux OS and basic programming tools (such as terminals, editors, compilers, and Python) will be highly beneficial.

Certificate: All participants will receive a certificate issued by TUM upon successful completion.

Program fee: There is no participation fee for attending the summer school. The program includes the academic and social program as well as support from the organizers. Accommodation, as well as travel to and from the event, are not included and must be arranged and paid for by the participants themselves.

Focus: Advanced high-performance computing (HPC) techniques for machine learning, data-driven models, and physics-based simulations.