Geometric and Physical Constraints Synergistically Enhance Neural PDE Surrogates

¹Helmholtz Centre Hereon, 21502 Geesthacht. ²Helmholtz AI
Published at the Forty-Second International Conference on Machine Learning (ICML 2025)

Long p4m/M Rollout Prediction for the Shallow Water System

The left-hand side compares the p4m/M rollout of the surface disturbance (ζ) with the reference; the right-hand side shows the accuracy scores (NRMSE-ζ and correlation) over time.
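As a minimal illustration of these two scores (with the exact normalization assumed rather than taken from the paper), NRMSE and correlation can be computed as:

import numpy as np

def nrmse(pred: np.ndarray, ref: np.ndarray) -> float:
    """RMSE normalized by the root-mean-square of the reference field."""
    return float(np.sqrt(np.mean((pred - ref) ** 2) / np.mean(ref ** 2)))

def correlation(pred: np.ndarray, ref: np.ndarray) -> float:
    """Pearson correlation between flattened prediction and reference fields."""
    return float(np.corrcoef(pred.ravel(), ref.ravel())[0, 1])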

Abstract

Neural PDE surrogates can improve the cost-accuracy tradeoff of classical solvers, but often generalize poorly to new initial conditions and accumulate errors over time. Physical and symmetry constraints have shown promise in closing this performance gap, but existing techniques for imposing these inductive biases are incompatible with the staggered grids commonly used in computational fluid dynamics. Here we introduce novel input and output layers that respect physical laws and symmetries on the staggered grids, and for the first time systematically investigate how these constraints, individually and in combination, affect the accuracy of PDE surrogates. We focus on two challenging problems: shallow water equations with closed boundaries and decaying incompressible turbulence. Compared to strong baselines, symmetries and physical constraints consistently improve performance across tasks, architectures, autoregressive prediction steps, accuracy measures, and network sizes. Symmetries are more effective than physical constraints, but surrogates with both performed best, even compared to baselines with data augmentation or pushforward training, while themselves benefitting from the pushforward trick. Doubly-constrained surrogates also generalize better to initial conditions and durations beyond the range of the training data.

Neural PDE Surrogates

The present study focuses on time-dependent variable fields w(t, x) for a fluid system. The general governing partial differential equations (PDEs) for the system can be written as follows:

∂w/∂t = F(t, x, w, ∇w, ∇²w, …)

with initial and boundary conditions, where t and x denote time and space, respectively. We aim to train neural networks to predict the time evolution of such a PDE system. The dynamical system can thus be written as:

w(t+∆t, ·) = G[w(t, ·)]

where G is the update operator represented by a neural network. To provide training data and evaluate performance, we use a reference solution generated by a numerical solver on space- and time-discretized variable fields.
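As a minimal sketch of this autoregressive setup (assumed, not the exact training or evaluation code), the learned operator G is applied repeatedly to advance the discretized state:

import torch

def rollout(G: torch.nn.Module, w0: torch.Tensor, n_steps: int) -> torch.Tensor:
    """Return the trajectory [w(t0), w(t0+∆t), ..., w(t0+n_steps·∆t)]."""
    states = [w0]
    w = w0
    for _ in range(n_steps):
        w = G(w)  # one learned update step: w(t+∆t) = G[w(t)]
        states.append(w)
    return torch.stack(states, dim=0)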

Symmetry- and Physics-Constrained Neural Surrogates

In this work, we assess the separate and combined benefits of symmetries and conservation laws for neural PDE surrogates. To achieve this, we construct equivariant input layers that support staggered grids, as well as output layers enforcing both equivariance and conservation laws.

Symmetry- and physics-constrained neural surrogate for incompressible flow on a staggered grid. A rotation-equivariant input layer maps velocities onto an unstaggered regular representation; hidden layers employ steerable convolutions; and the equivariant output layer enforces mass and momentum conservation as it maps back to staggered velocities.

The figure above illustrates our overall framework for constructing equivariant, conservative neural surrogates. As an illustrative example, we show the incompressible Navier–Stokes equations, with equivariance to translations and rotations, momentum conservation and a divergence-free condition (equivalent to mass conservation). Input data defined on staggered grids are mapped through novel equivariant input layers to a set of convolutional output channels defined at grid-cell centers. The internal activations consist of regular representations: groups of channels indexed by G, on which G acts by transforming each spatial field and permuting the channels according to the group action. Essentially, regular representations are real-valued functions on the discrete symmetry group G. This formulation allows us to use the existing library escnn for all internal linear transformations between hidden layers. Finally, we employ novel output layers to map the regular representation back to the staggered grid while enforcing conservation laws as hard constraints.
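As a minimal sketch of a hidden layer in this design (p4 setting shown; the layer width of 16 regular fields is an assumption, and the custom staggered-grid input/output layers are omitted), escnn can be used as follows:

import torch
from escnn import gspaces, nn as enn

r2_act = gspaces.rot2dOnR2(N=4)                  # p4: translations + 90° rotations
in_type = enn.FieldType(r2_act, 16 * [r2_act.regular_repr])
out_type = enn.FieldType(r2_act, 16 * [r2_act.regular_repr])

hidden = enn.SequentialModule(
    enn.R2Conv(in_type, out_type, kernel_size=3, padding=1),
    enn.InnerBatchNorm(out_type),
    enn.ReLU(out_type),
)

x = enn.GeometricTensor(torch.randn(1, in_type.size, 64, 64), in_type)
y = hidden(x)   # equivariant: rotating x by 90° rotates y and permutes its channels

For the p4m networks, the rotation-only group would be replaced by gspaces.flipRot2dOnR2(N=4).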

The following sections present selected figures illustrating the effect of symmetry constraints in the network input layers and in the full networks.

Experiments - 2D Shallow Water System

The shallow water equations (SWEs) are widely used to describe quasi-static motion in a homogeneous incompressible fluid with a free surface. Here we focus on the nonlinear SWEs in momentum- and mass-conserving form; the full formulation can be found in our paper!

We first trained and evaluated neural surrogates for the SWE task. We followed a hybrid learning strategy, based on the observation that the semi-implicit numerical integration scheme computes ζ_{t+1} slowly with an iterative solver, but then computes [u_{t+1}, v_{t+1}] from ζ_{t+1} quickly through a closed-form expression. We therefore trained surrogates to predict only ζ_{t+1} and computed [u_{t+1}, v_{t+1}] analytically, as sketched below. Keeping parameter counts constant, we compared networks equivariant to three symmetry groups: p1 (translation only, as in standard CNNs), p4 (translation-rotation) and p4m (translation-rotation-reflection). We also compared mass-conserving networks (M) to those without physical constraints (∅), and additionally compared against unconstrained FNO and drnet architectures. The primary results are presented in the following figure and table.
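A minimal sketch of one hybrid rollout step (collocated fields and a generic input layout are assumed for simplicity; update_velocities is a hypothetical wrapper around the solver's closed-form velocity update):

import torch

def hybrid_step(net: torch.nn.Module, zeta, u, v, update_velocities):
    """Network predicts ζ_{t+1}; velocities then follow from the solver's formula."""
    state = torch.stack([zeta, u, v], dim=1)       # (batch, 3, ny, nx), layout assumed
    zeta_next = net(state).squeeze(1)              # surrogate predicts only ζ_{t+1}
    u_next, v_next = update_velocities(zeta_next, zeta, u, v)  # cheap analytic step
    return zeta_next, u_next, v_next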

We next investigated generalization to novel ICs for closed shallow water systems. The figure below shows an example in which the initial surface elevation is an 'L' shape formed as the sum of two rectangular elevations 0.1 m in height, with randomly varying location and shape. As before, we found the maximally constrained model p4m/M to outperform alternatives with equal parameter counts. Additional generalization results can be found in our paper!

Generalization beyond training data for closed shallow water systems. (a) SWE rollouts from p1/∅ and p4m/M on L-shaped ICs are compared to the reference. (b-c) Accuracy of each network over six generalization tests.
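A purely illustrative sketch of such L-shaped ICs (the exact construction, grid size and sampling ranges used in the paper may differ):

import numpy as np

def l_shaped_ic(n=96, height=0.1, rng=np.random.default_rng(0)):
    """Sum of two random rectangular elevations of `height` metres sharing a corner."""
    zeta = np.zeros((n, n))
    x0, y0 = rng.integers(n // 8, n // 2, size=2)     # shared corner of the two arms
    for _ in range(2):
        w, h = rng.integers(n // 8, n // 3, size=2)   # random arm width and length
        zeta[y0:y0 + h, x0:x0 + w] += height          # heights add where the arms overlap
    return zeta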

Experiments - 2D Decaying Turbulence

The incompressible Navier–Stokes equations (INS) describe momentum balance for incompressible Newtonian fluids. We consider the 'decaying turbulence' scenario introduced by Kochkov et al. (2021). The velocity field is initialized as filtered Gaussian noise containing high spatial frequencies. Predicting its evolution is challenging, since the eddy size and Reynolds number change as structures in the flow coalesce and the velocity field becomes smoother and more uniform over time.
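A minimal sketch of such an initial velocity component (following the usual filtered-noise recipe rather than the exact code of Kochkov et al., 2021), shaped in Fourier space so its energy peaks near a chosen wavenumber:

import numpy as np

def filtered_noise(n=64, k0=10, rng=np.random.default_rng(0)):
    """Gaussian noise filtered so its spectrum peaks near wavenumber k0."""
    noise = rng.standard_normal((n, n))
    kx = np.fft.fftfreq(n) * n
    k = np.sqrt(kx[None, :] ** 2 + kx[:, None] ** 2)
    envelope = k ** 4 * np.exp(-2.0 * (k / k0) ** 2)   # peaked near k ≈ k0
    return np.fft.ifft2(np.fft.fft2(noise) * np.sqrt(envelope)).real

In practice the two components (u, v) generated this way would also be projected onto the divergence-free subspace (see the projection sketch below).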

Here we used the velocity fields [u, v] as both inputs and outputs. As for the SWEs, we tested p1, p4 and p4m equivariance, but now considered three levels of physical constraints: unconstrained (∅), momentum conservation (ρ u) and mass/momentum conservation (M+ρ u).
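One common way to realize such hard constraints in an output layer, shown here for a periodic collocated grid rather than the paper's staggered-grid construction (which is detailed in the paper itself), is a mean correction for momentum and a spectral projection for the divergence-free condition:

import numpy as np

def conserve_momentum(u_pred, u_in):
    """Shift the prediction so its spatial mean (total momentum) matches the input."""
    return (u_pred - u_pred.mean(axis=(-2, -1), keepdims=True)
                   + u_in.mean(axis=(-2, -1), keepdims=True))

def project_divergence_free(u, v):
    """Remove the divergent part of (u, v) on a periodic grid via a Leray projection."""
    ny, nx = u.shape[-2:]
    kx = np.fft.fftfreq(nx) * 2 * np.pi
    ky = np.fft.fftfreq(ny) * 2 * np.pi
    KX, KY = np.meshgrid(kx, ky)
    k2 = KX ** 2 + KY ** 2
    k2[0, 0] = 1.0                          # avoid dividing the mean mode by zero
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    div = KX * uh + KY * vh                 # divergence in spectral space (up to a factor i)
    return (np.fft.ifft2(uh - KX * div / k2).real,
            np.fft.ifft2(vh - KY * div / k2).real)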

The following figure compares autoregressive rollouts from unconstrained (p1/∅) and maximally constrained networks (p4m/M+ρ u). We observed improvements in the accuracy and stability of the INS surrogates from both types of constraints (see Fig. b-c). The best results also outperformed those of networks trained with input noise. To evaluate the surrogates beyond the point at which their predictions diverge from the reference solution, we compared the velocity and energy spectra of the predicted and reference fields, as in previous studies. Furthermore, we compared the performance gains from symmetries and physical constraints with those offered by specialised training modes for PDE surrogates, such as data augmentation and the pushforward trick. See the following figures and tables for details.
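A minimal sketch (standard radial binning, assumed rather than taken from the paper) of the kinetic-energy spectrum used in such comparisons:

import numpy as np

def energy_spectrum(u, v):
    """Radially binned kinetic-energy spectrum E(k) of a periodic 2D velocity field."""
    ny, nx = u.shape
    uh, vh = np.fft.fft2(u), np.fft.fft2(v)
    e = 0.5 * (np.abs(uh) ** 2 + np.abs(vh) ** 2) / (nx * ny) ** 2
    kx = np.fft.fftfreq(nx) * nx
    ky = np.fft.fftfreq(ny) * ny
    k = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    kbins = np.arange(0.5, min(nx, ny) // 2 + 1, 1.0)   # integer-wavenumber shells
    idx = np.digitize(k.ravel(), kbins)
    ek = np.bincount(idx, weights=e.ravel(), minlength=len(kbins) + 1)
    return np.arange(1, len(kbins)), ek[1:len(kbins)]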

We tested surrogates on ICs with peak wavenumber changed from 10 to 8 or 6. p4m/M+ρ u rollouts more closely matched the reference solver and its spectra, see the following figures.

Generalization beyond training data for decaying turbulence. (a) INS rollouts from p1/∅ and p4m/M+ρ u on ICs with peak wavenumber of 8. (b-c) Velocity and energy spectra for INS at t = 99.7 s, averaged over 10 ICs.

We investigated how the enhancement of neural INS surrogates by symmetry and physical constraints depends on the size of the network and dataset. We trained p1/∅ and p4m/M+ρ u networks with 0.1M, 2M and 8.5M parameters on the same dataset (100 simulations). Training 0.1M-parameter p1/∅ and p4m/M+ρ u networks on datasets of 100, 400 and 760 simulations showed that the constraints enhanced performance robustly across dataset sizes; see the following figures for details.

Accuracy of symmetry- and physics-constrained INS models across data and network sizes, at 4.2 and 12.6 s. (a-b) NRMSE-u and correlation vs. network size for p1/∅ and p4m/M+ρ u. (c-d) NRMSE-u and correlation for p1/∅ and p4m/M+ρ u vs. training dataset size.

Experiments - Real Ocean Dynamics

The ocean current velocity data were sourced from the Global Ocean Physics Analysis and Forecast product. We selected 6-hourly data from 2022-06-01 to 2025-04-05 in three regions from different oceans, with (latitude, longitude) ranges (30∼42, 168∼180), (-49∼-37, 80∼92) and (-51∼-39, -30∼-18). Each field is 144×144; to reduce the size of the training data, the fields are downscaled to a coarse 48×48 grid for training, validation and testing.
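A minimal sketch of this coarsening step, assuming simple block averaging (the actual regridding used may differ):

import numpy as np

def block_average(field: np.ndarray, factor: int = 3) -> np.ndarray:
    """Average non-overlapping factor×factor blocks (144×144 -> 48×48 for factor=3)."""
    ny, nx = field.shape[-2:]
    return field.reshape(*field.shape[:-2], ny // factor, factor,
                         nx // factor, factor).mean(axis=(-3, -1))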

After training, predictions are made for a region and time period distinct from the training data: the (latitude, longitude) range is (-44∼-32, -130∼-118), and the forecast period runs from 2023-06-01 to 2025-04-05. The figure below shows the details.

Accuracy of neural PDE surrogates for real-world ocean currents. Results are shown for the unconstrained Modern U-net p1/∅, Wang's Equrot (Unet), and the doubly constrained network p4m/M+ρ u. (a) Zonal ocean currents from the three surrogates and the reference data. (b-c) NRMSE-u and correlation for forecasts up to 300 hours into the future, averaged over 30 randomly selected initial conditions.

Furthermore, the following table compares NRMSE for zonal velocity, energy spectrum error (ESE) and correlation for real ocean currents predicted 12 and 120 hours ahead.

Comparison of NRMSE for zonal velocity, energy spectrum error (ESE) and correlation for real ocean currents predicted 12 h and 120 h ahead. The architecture and hyperparameters of the Equrot U-net are as described in Wang's Equrot (Unet).

Video Presentation

Poster

BibTeX

@inproceedings{huang2025geometric,
  title={Geometric and Physical Constraints Synergistically Enhance Neural PDE Surrogates},
  author={Huang, Yunfei and Greenberg, David},
  booktitle={International Conference on Machine Learning},
  year={2025},
  organization={PMLR}
}