ML Coupling Workshop

3rd - 4th September 2025, Cambridge, UK.

Programme

Wednesday 3rd September

9:00-9:30 Arrival and Coffee
9:30-9:45 Welcome
9:45-11:15 Talks Session 1
11:15-11:45 Break
11:45-12:45 Talks Session 2
12:45-14:00 Lunch
14:00-15:30 Talks Session 3
15:30-16:00 Break
16:00-16:45 Panel Discussion
16:45-17:00 Introduction to breakout groups
17:00-19:00 Poster session with drinks and an informal dinner

Thursday 4th September

9:00-9:15 Arrival and Coffee
9:15-9:30 Get into breakout groups
9:30-10:30 Breakout session 1
  • Coupling Interfaces
  • Hardware
  • Differentiable Models and Online Training
10:30-11:00 Feedback Session 1
11:00-11:30 Break
11:30-12:30 Breakout session 2
  • Stability and Uncertainty
  • Machine learning architectures
  • Research to Operations
12:30-13:00 Feedback Session 2
13:00-14:00 Lunch

Talks and Speakers

Improving physical models of the atmosphere using ML

Cyril Morcrette - Met Office

Cyril leads a team improving the way that clouds and radiation are represented in weather forecasts and climate simulations.

Abstract: The Met Office develops and maintains a unified modelling framework used for weather forecasting and climate applications. Simulations using this model can be used to feed products and deliver advice and warnings; they can also be used to provide the next generation of km-scale high-fidelity datasets for ML training. As a result, these physical simulations need to be the best they can be, and ML can be used to improve certain parametrization schemes and the interactions between processes. An overview of projects aiming to improve the physical model by embedding ML techniques will be provided. We will also give an overview of the software development that allows us to fuse ML techniques with our physical model in a manner which is robust to changes in the modelling landscape.

A Tale of Two Couplings

Hannah Christensen - University of Oxford

Hannah is an associate professor and head of the Atmospheric Processes group at the University of Oxford. She researches uncertainty quantification of parameterisation schemes in numerical models and leads the Model Uncertainty-Model Intercomparison Project (MUMIP). She is also exploring the application of machine learning methods in the domain.

Abstract: Earth System Prediction is changing before our eyes, with machine-learned (ML) models now able to outperform traditional dynamical models on many tasks. In this presentation I will discuss how we can bring together traditional dynamical codebases with ML emulators to advance Earth System Prediction. I will give two examples. In the first, we replace one specific, but uncertain, small-scale process with an ML emulator. In the second, we attempt the more ambitious task of replacing a whole Earth-system component with an ML emulator. In both projects, the ML emulator is trained on observational data. In both projects, the emulator is probabilistic, to represent the aleatoric (random) uncertainties captured in these training datasets. However, the coupling challenges are very different between the two projects, necessitating very different solutions.
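
As a toy illustration of what a probabilistic emulator means in practice (an assumed form, not the actual Oxford implementation), the ML component can predict a mean and spread for each output, and the coupled model can draw a fresh sample on every call, so the aleatoric uncertainty in the training data is retained:

    # Toy sketch of a probabilistic emulator: predict a per-variable mean
    # and standard deviation, then sample at run time. All names and the
    # linear "model" are hypothetical stand-ins.
    import numpy as np

    def probabilistic_emulator(state, rng):
        mean = 0.5 * state                    # stand-in for an ML mean prediction
        std = 0.1 * np.abs(state) + 1e-6      # stand-in for an ML spread prediction
        return rng.normal(mean, std)          # one stochastic sample per call

    rng = np.random.default_rng(42)
    state = np.ones(4)                        # toy model state
    tendency = probabilistic_emulator(state, rng)  # used in place of the physical scheme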

The use of turbulence surrogate models in plasma integrated modelling

William Hornsby - UKAEA

Scientific software engineer specialising in modern HPC architectures and machine learning for simulation in fusion applications.

Abstract: Plasma micro-turbulence is one of the dominant mechanisms transporting heat from the core of a fusion power plant. Direct numerical calculation of the micro-instabilities that form turbulence is computationally expensive and is a significant bottleneck in integrated plasma modelling, in which the many physical processes are coupled to predict reactor-level behaviour and to optimise operational scenarios of fusion power plants. The considerable number of geometric and thermodynamic parameters, the interactions that influence the turbulence, and the resolutions needed to accurately resolve these turbulent modes make direct numerical simulation for parameter-space exploration computationally extremely challenging. These same properties, however, make the problem well suited to surrogate modelling, where speed-ups of up to 10^5 are possible, making rapid scenario development a realistic prospect. In this talk the integrated plasma modelling use-case will be introduced, as well as the turbulence surrogate modelling efforts at UKAEA, including how the models are integrated into larger workflows.
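
As a rough sketch of the surrogate idea (hypothetical inputs and data, not the UKAEA pipeline), one can train a regressor on (plasma parameters, turbulent flux) pairs precomputed with an expensive turbulence code, then call the cheap regressor from the integrated-modelling loop:

    # Minimal surrogate-modelling sketch: fit a neural-network regressor to
    # precomputed simulation data, then use it in place of the expensive
    # direct numerical simulation. Features and targets are toy stand-ins.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X_train = rng.uniform(0.0, 1.0, size=(5000, 3))   # e.g. gradients, geometry
    y_train = np.sin(3 * X_train[:, 0]) + X_train[:, 1] * X_train[:, 2]

    surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
    surrogate.fit(X_train, y_train)

    # Inside the integrated-modelling loop, this call replaces the direct
    # simulation, which is where the large speed-ups come from.
    flux = surrogate.predict(rng.uniform(0.0, 1.0, size=(1, 3)))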

Development of an ML-enhanced version of the ICON Earth System Model

Julien Savre - DLR

Research Scientist.

Abstract: Despite continuous improvements in the performance of Earth System Models (ESMs), systematic errors and modeling uncertainties still limit their ability to produce reliable climate projections. Whereas part of the observed biases can be explained by uncertainties associated with the representation of subgrid processes in the atmosphere (such as convective clouds or turbulence) through so-called parameterizations, the benefits we can expect from further improving and adding complexity to these parameterizations remain small due to inherent limitations imposed by the coarse spatial resolutions employed by the models. In recent years, machine learning (ML) has emerged as a powerful tool to develop surrogate parameterizations informed by high-resolution model outputs or Earth observations, promising to significantly reduce long-standing biases and enhance the projection capabilities of current ESMs (Eyring et al., 2024, https://doi.org/10.1038/s41561-024-01527-w). This talk will report on recent achievements made towards the development of a hybrid, machine-learning-enhanced (MLe) version of the ICOsahedral Non-hydrostatic model (ICON-MLe) equipped with various ML-based parameterizations for cloud fraction, radiation, and convection, and preliminary results obtained in AMIP-like simulations will be presented. The talk will also highlight the necessity of recalibrating hybrid ESMs using fast automated techniques (Grundner et al., 2025, https://doi.org/10.48550/arXiv.2505.04358), as well as the importance of developing methods that can be easily shared across models.

Active learning for Emulation

Christopher Sprague - Alan Turing Institute

Senior Research Associate at the Alan Turing Institute.

Abstract: High-fidelity simulations are essential across science and engineering, but their computational cost often limits how widely they can be used. Emulators – machine-learned surrogates of simulators – offer orders-of-magnitude speedups, but require substantial (and often expensive) training data from simulators. This talk will introduce AutoEmulate, the Alan Turing Institute’s open-source framework for automated emulation, and show how we integrate active learning to reduce the amount of simulation data needed. The talk will outline different types of active learning, explain why the stream-based setting is especially relevant, and share how we have built these methods into AutoEmulate for practical use.
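
To make the stream-based setting concrete (an illustrative sketch under assumed names, not the AutoEmulate API), candidate inputs arrive one at a time and the expensive simulator is queried only where the emulator's own predictive uncertainty is high:

    # Sketch of stream-based active learning for emulation: a Gaussian
    # process emulator triggers new simulator runs only for uncertain
    # inputs. The simulator and threshold are hypothetical stand-ins.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def simulator(x):
        return np.sin(5 * x[..., 0]) + x[..., 1] ** 2   # toy "expensive" model

    rng = np.random.default_rng(1)
    X = rng.uniform(0, 1, size=(10, 2))                 # small initial design
    y = simulator(X)
    gp = GaussianProcessRegressor().fit(X, y)

    threshold = 0.05
    for _ in range(200):                                # stream of candidates
        x_new = rng.uniform(0, 1, size=(1, 2))
        _, std = gp.predict(x_new, return_std=True)
        if std[0] > threshold:                          # query only informative points
            X = np.vstack([X, x_new])
            y = np.append(y, simulator(x_new))
            gp.fit(X, y)                                # refit the emulator online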

Hybrid machine learning and data assimilation in weather forecasting

Alan Geer - ECMWF

Principal Scientist at ECMWF.

Abstract: Traditional numerical weather prediction (NWP) takes observations of the Earth system, combines these observations with physical models using data assimilation to create "initial conditions", and then runs a physically based model from these initial conditions to forecast future weather. Data-driven forecasts, based on machine learning, are now in some respects more accurate than those made by physical models. Attempts are also being made to replace the data assimilation process to create an "end-to-end" data-driven forecasting system that directly converts observations into weather forecasts.

However, there are strong arguments for retaining the physical approach, including the Bayesian perspective that the most accurate posterior knowledge of the atmosphere should come from the combination of observationally based knowledge with prior knowledge, meaning the physical equations and other knowledge embedded in numerical models. But the quality of data-driven forecasts suggests that existing physical forecast models have substantial errors and need to be improved. This motivates a hybrid physical-empirical approach to weather forecasting, using empirical components to improve or augment the physical approach.

One approach is to learn systematic error corrections that can be applied to the physical forecast model every few hours, or at the timestep level. A more "granular" hybrid approach targets the most uncertain components of a forecasting system, which are replaced or augmented by data-driven components, including more focused systematic error corrections, while other components may remain entirely physical. Both approaches are being explored inside the physically based Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts. The error correction approach is in testing and significantly improves the quality of physically based forecasts, bringing them closer in quality to data-driven equivalents while retaining most benefits of the physical approach. A granular hybrid is already used operationally in the forward modelling of satellite radiance observations in sea-ice areas, where no sufficiently accurate physical models and state information were previously available. This opens up a new area for physically based NWP.

In both cases, a major question is how best to train and maintain the empirical components, especially when the surrounding physical model versions are upgraded. This likely involves a combination of pre-training offline and fine-tuning online. The aim is to continue to optimise the machine learning components within the data assimilation system for weather forecasting, most likely decoupled from the process of obtaining initial conditions. From a technical point of view, the empirical components have been implemented in Python-based machine learning packages for offline training and then rewritten in Fortran and C++ for use in NWP systems. As solutions to these issues become more developed, they may define the architecture of hybrid environmental prediction systems for decades to come.
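
A minimal sketch of the timestep-level error-correction idea (hypothetical function names, not the IFS implementation): the physical model advances the state, and a learned component adds a correction at each step:

    # Hybrid forecast loop with a learned correction applied at every
    # timestep. physics_step and correction_net are placeholders for the
    # physical model and a trained ML error-correction component.
    import numpy as np

    def physics_step(state, dt):
        return state + dt * np.cos(state)       # stand-in physical tendency

    def correction_net(state):
        return -0.01 * state                    # stand-in learned increment

    state = np.zeros(10)                        # toy model state
    dt = 0.1
    for _ in range(100):
        state = physics_step(state, dt)         # physical forecast step
        state = state + correction_net(state)   # learned systematic-error correction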

ML-Emulator for Cloud Microphysics in ICON in a Realistic Climate Model Experiment

Caroline Arnold - Helmholtz-Zentrum Hereon

Research AI Consultant.

Abstract: As the spatial resolution of general circulation models (GCMs) increases and storms and clouds can be resolved, the underlying cloud microphysics still need to be parameterised. This is known to be a major source of uncertainty in climate and weather simulations. The established parameterisations use bulk moment schemes, in which the conversion of cloud droplets to rain droplets is approximated through empirical relationships. Particle-based superdroplet simulations would provide a more accurate representation, but are typically not feasible for use in GCMs.

We couple SuperdropNet [1], an ML emulator for warm-rain cloud microphysics trained on superdroplet simulations, to ICON [2]. Previously, we validated the coupled model in an idealised cloud microphysics test case and showed that SuperdropNet runs stably and provides reasonable precipitation patterns [3].

Now we move towards a realistic climate model experiment with 10 km horizontal resolution. We use historical greenhouse gas forcing and observed sea-surface temperatures as boundary conditions (a so-called AMIP experiment) to investigate whether and how our hybrid model simulates more realistic precipitation compared to the widely used empirical two-moment bulk scheme parameterisation.

Coupling SuperdropNet to ICON is achieved using FTorch. We are able to run ICON on 128 nodes on the CPU partition of the HPC system Levante with minimal overhead. Conditions beyond the training data range of SuperdropNet lead to negative feedback loops and impact the long-term stability of the coupled simulation. We implement physics-based constraints that improve stability. We run the hybrid model and the reference simulation for seven days and present first results. Furthermore, we test an autoregressive rollout of SuperdropNet that allows for longer GCM time steps and investigate how this impacts stability and results.

REFERENCES:

[1] Sharma, S., and Greenberg, D.: "SuperdropNet: a Stable and Accurate Machine Learning Proxy for Droplet-based Cloud Microphysics." JAMES, 2025.
[2] https://doi.org/10.35089/WDCC/IconRelease01
[3] Arnold, C., Sharma, S., Weigel, T., and Greenberg, D. S.: Efficient and stable coupling of the SuperdropNet deep-learning-based cloud microphysics (v0.1.0) with the ICON climate and weather model (v2.6.5), Geosci. Model Dev., 17, 4017-4029, https://doi.org/10.5194/gmd-17-4017-2024, 2024.
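
As an illustration of the kind of physics-based constraint mentioned in the abstract (a hedged sketch; the actual ICON/SuperdropNet implementation may differ), one can clip nonphysical negative mixing ratios from the emulator output and rescale so that total water is conserved across the ML call:

    # Sketch of a physics-based constraint wrapped around an ML
    # microphysics call: remove negative masses, then restore total-water
    # conservation. Shapes, values, and names are hypothetical.
    import numpy as np

    def constrain(q_before, q_after):
        """q_*: mixing ratios of water species (e.g. cloud, rain)."""
        q = np.clip(q_after, 0.0, None)          # no negative masses
        total_before = q_before.sum()
        total_after = q.sum()
        if total_after > 0:
            q *= total_before / total_after      # conserve total water mass
        return q

    q_in = np.array([1.2e-3, 3.0e-4])            # state before the emulator step
    q_ml = np.array([1.4e-3, -1.0e-5])           # raw emulator output
    q_out = constrain(q_in, q_ml)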

Differentiable programming for scientific computing with Enzyme and Julia

Valentin Churavy - University of Augsburg

Milan Klower - University of Oxford

Valentin is a postdoc and RSE who has previously worked at JuliaLab.
Milan is a NERC research fellow and developer of SpeedyWeather.jl.

Abstract: To couple large-scale scientific codes with machine-learning approaches, we need to obtain derivatives of scientific programs. These programs are often written in a very different style from machine-learning code and thus place different requirements on the automatic differentiation framework. For example, Python+JAX requires a functional style, with no array mutation allowed, and supports only limited control flow. In contrast, scientific simulations such as climate models are typically written with both array mutation and complicated control flow, particularly in the context of multi-physics. This talk will introduce the different approaches for obtaining derivatives from computer programs and explain why compiler-enabled automatic differentiation à la Enzyme leads to true differentiable programming. We motivate the necessity of differentiable programming with examples from climate modelling, discussing current and ongoing projects using Oceananigans and SpeedyWeather.jl.
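
To make the contrast with Python+JAX concrete, here is a small Python/JAX example of the functional, mutation-free style the abstract describes (illustrative only; Enzyme itself differentiates Julia code at the compiler level and supports mutation directly):

    # JAX requires pure functions: arrays are updated out-of-place via
    # .at[...].set(...), and jax.grad differentiates the pure function.
    import jax
    import jax.numpy as jnp

    def energy(x):
        y = x.at[0].set(x[0] ** 2)    # out-of-place update; returns a copy
        return jnp.sum(y ** 2)

    grad_energy = jax.grad(energy)    # reverse-mode automatic differentiation
    print(grad_energy(jnp.array([1.0, 2.0, 3.0])))

    # The in-place NumPy equivalent, x[0] = x[0] ** 2, mutates its input
    # and cannot be traced by JAX; compiler-level AD such as Enzyme lifts
    # this restriction for mutating, control-flow-heavy simulation code.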