Neural PDE surrogates can improve the cost-accuracy tradeoff of classical solvers, but often generalize poorly to new initial conditions and accumulate errors over time. Physical and symmetry constraints have shown promise in closing this performance gap, but existing techniques for imposing these inductive biases are incompatible with the staggered grids commonly used in computational fluid dynamics. Here we introduce novel input and output layers that respect physical laws and symmetries on the staggered grids, and for the first time systematically investigate how these constraints, individually and in combination, affect the accuracy of PDE surrogates. We focus on two challenging problems: shallow water equations with closed boundaries and decaying incompressible turbulence. Compared to strong baselines, symmetries and physical constraints consistently improve performance across tasks, architectures, autoregressive prediction steps, accuracy measures, and network sizes. Symmetries are more effective than physical constraints, but surrogates with both performed best, even compared to baselines with data augmentation or pushforward training, while themselves benefitting from the pushforward trick. Doubly-constrained surrogates also generalize better to initial conditions and durations beyond the range of the training data.
The present study focuses on time-dependent variable fields w(t, x) for a fluid system. The general governing partial differential equations (PDEs) for the system can be written as follows:
w_t = F(t, x, w, ∇w, ∇²w, …)
with initial and boundary conditions, where t and x denote time and space, respectively. We aim to train neural networks to predict the time evolution of such PDE systems. The dynamical system can thus be written as:
w(t+∆t, ·) = G[w(t, ·)]
where G is the update operator implemented by a neural network. To provide training data and evaluate performance, we use a reference solution generated by a numerical solver on space- and time-discretized variable fields.
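To make the autoregressive setup concrete, here is a minimal sketch (in PyTorch, with generic tensor shapes; not our exact training or evaluation code) of rolling out a learned update operator G:

```python
import torch

@torch.no_grad()
def rollout(G: torch.nn.Module, w0: torch.Tensor, n_steps: int) -> torch.Tensor:
    """Apply the learned update operator repeatedly: w(t + dt) = G[w(t)].

    Returns the stacked trajectory [w0, w1, ..., w_{n_steps}]."""
    states = [w0]
    for _ in range(n_steps):
        states.append(G(states[-1]))  # each prediction becomes the next input
    return torch.stack(states)
```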
In this work, we assess the separate and combined benefits of symmetries and conservation laws for neural PDE surrogates. To achieve this, we construct equivariant input layers that support staggered grids, as well as output layers enforcing both equivariance and conservation laws.
The figure below illustrates our overall framework for constructing equivariant, conservative neural surrogates. As an illustrative example, we show the incompressible Navier–Stokes equations, with equivariance to translation and rotation, momentum conservation and a divergence-free condition (equivalent to mass conservation). Input data defined on staggered grids are mapped through novel equivariant input layers to a set of convolutional channels defined at grid cell centers. The internal activations consist of regular representations: groups of channels indexed by the symmetry group G, on which G acts by transforming each spatial field and permuting the channels according to the group action. Essentially, regular representations are real-valued functions on the discrete symmetry group G. This formulation allows us to use the existing escnn library for all internal linear transformations between hidden layers. Finally, we employ novel output layers to map the regular representation back to the staggered grid while enforcing conservation laws as hard constraints.
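As a concrete illustration of the regular-representation interior layers, here is a minimal escnn sketch; the staggered-grid input and conservative output layers described above are the novel components of this work and are not shown here, and the field sizes are illustrative assumptions:

```python
import torch
from escnn import gspaces, nn as enn

# p4: translations plus 90-degree rotations
gspace = gspaces.rot2dOnR2(N=4)

# two scalar fields at cell centres as input (illustrative; in our framework the
# novel input layer maps staggered-grid data to cell centres before this stage)
in_type = enn.FieldType(gspace, 2 * [gspace.trivial_repr])
# hidden activations: 16 regular representations, i.e. one channel per group
# element, which the group action permutes
hid_type = enn.FieldType(gspace, 16 * [gspace.regular_repr])

block = enn.SequentialModule(
    enn.R2Conv(in_type, hid_type, kernel_size=3, padding=1),
    enn.ReLU(hid_type),
)

x = enn.GeometricTensor(torch.randn(1, 2, 48, 48), in_type)
y = block(x)  # y.tensor has shape (1, 16 * 4, 48, 48)
```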
In the following sections, we present selected figures illustrating how symmetry is imposed in the input layers and in the networks as a whole.
The shallow water equations (SWEs) are widely used to describe quasi-static motion of a homogeneous incompressible fluid with a free surface. This study focuses on the nonlinear SWEs in momentum- and mass-conservative form. The full formulation can be found in our paper!
We first trained and evaluated neural surrogates for the SWE task. We followed a hybrid learning strategy, based on the observation that the semi-implicit numerical integration scheme computes ζ_{t+1} slowly with an iterative solver, but then computes [u_{t+1}, v_{t+1}] from ζ_{t+1} quickly via an explicit formula. We therefore trained surrogates to predict only ζ_{t+1}, and computed [u_{t+1}, v_{t+1}] from it. Keeping parameter counts constant, we compared networks equivariant to 3 symmetry groups: p1 (translation only, as in standard CNNs), p4 (translation-rotation) and p4m (translation-rotation-reflection). We also compared mass-conserving networks (M) to those without physical constraints (∅), and additionally to unconstrained FNO and drnet architectures. The primary results are presented in the following figure and table.
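A minimal sketch of this hybrid rollout is shown below; the surrogate network, the closed-form velocity update of the semi-implicit scheme, and the variable shapes are all placeholders here, and the exact velocity formula is given in the paper:

```python
import torch

@torch.no_grad()
def hybrid_rollout(surrogate, velocities_from_elevation, zeta, u, v, n_steps):
    """SWE rollout in which only the surface elevation is learned.

    `surrogate` predicts the next elevation from the current state, and
    `velocities_from_elevation` stands in for the explicit (not learned)
    velocity update of the semi-implicit scheme."""
    trajectory = [(zeta, u, v)]
    for _ in range(n_steps):
        zeta = surrogate(zeta, u, v)                  # learned: next elevation
        u, v = velocities_from_elevation(zeta, u, v)  # analytic: next velocities
        trajectory.append((zeta, u, v))
    return trajectory
```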
Next, we investigated generalization to novel ICs for closed shallow water systems. The figure shows an example IC: the sum of two rectangular elevations of 0.1 m height with randomly varying locations and shapes, which here combine to form an ‘L’ shape. As previously, we found the maximally constrained model p4m/M to outperform alternatives with equal parameter counts. Additional generalization results can be found in our paper!
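For illustration, ICs of this kind can be generated roughly as follows (grid size and sampling ranges are assumptions, not our exact settings):

```python
import numpy as np

def two_rectangle_ic(n=96, height=0.1, seed=0):
    """Surface elevation built from two rectangular bumps of 0.1 m height with
    random positions and extents; overlapping or adjacent bumps can form an
    'L'-shaped pattern."""
    rng = np.random.default_rng(seed)
    zeta = np.zeros((n, n))
    for _ in range(2):
        y0, x0 = rng.integers(0, n // 2, size=2)
        dy, dx = rng.integers(n // 8, n // 2, size=2)
        zeta[y0:y0 + dy, x0:x0 + dx] += height
    return zeta
```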
The incompressible Navier–Stokes equations (INS) describe momentum balance for incompressible Newtonian fluids. We consider the ‘decaying turbulence’ scenario introduced by Kochkov et al. (2021). The velocity field is initialized as filtered Gaussian noise containing high spatial frequencies. Predicting its evolution is challenging, since eddy size and Reynolds number change over time as structures in the flow field coalesce and the velocity field becomes smoother and more uniform.
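A rough sketch of such an IC (band-pass-filtered, divergence-free Gaussian noise with a prescribed peak wavenumber; not Kochkov et al.'s exact recipe):

```python
import numpy as np

def filtered_noise_ic(n=64, peak_wavenumber=10, width=2.0, seed=0):
    """Divergence-free velocity field from Gaussian noise filtered around a
    peak wavenumber (illustrative only)."""
    rng = np.random.default_rng(seed)
    k1d = np.fft.fftfreq(n, d=1.0 / n)
    KX, KY = np.meshgrid(k1d, k1d, indexing="ij")
    K = np.sqrt(KX**2 + KY**2)
    envelope = np.exp(-0.5 * ((K - peak_wavenumber) / width) ** 2)  # band-pass filter
    u_hat = envelope * np.fft.fft2(rng.standard_normal((n, n)))
    v_hat = envelope * np.fft.fft2(rng.standard_normal((n, n)))
    K2 = np.where(K == 0, 1.0, K**2)
    div = KX * u_hat + KY * v_hat                 # project out the divergent part
    u_hat, v_hat = u_hat - KX * div / K2, v_hat - KY * div / K2
    return np.real(np.fft.ifft2(u_hat)), np.real(np.fft.ifft2(v_hat))
```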
Here we used the velocity fields [u, v] as both inputs and outputs. As for the SWEs, we tested p1, p4 and p4m equivariance, but now considered 3 levels of physical constraints: unconstrained (∅), momentum conservation (ρ) and combined mass and momentum conservation (M+ρ).
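One common way to impose the divergence-free (mass conservation) condition as a hard output constraint on a staggered grid is to let the network predict a stream function and take discrete differences of it; the sketch below illustrates the idea but is not necessarily our exact output layer. Momentum conservation can additionally be enforced by correcting the mean of each velocity component.

```python
import torch

def divergence_free_velocity(psi: torch.Tensor):
    """psi: stream function at cell corners, shape (batch, 1, H + 1, W + 1)
    for an H x W cell grid with unit spacing.

    Returns face-normal velocities on a staggered (Arakawa C) grid:
      u =  d(psi)/dy on vertical faces,   shape (batch, 1, H, W + 1)
      v = -d(psi)/dx on horizontal faces, shape (batch, 1, H + 1, W)
    The discrete divergence of every cell is exactly zero by construction."""
    u = psi[:, :, 1:, :] - psi[:, :, :-1, :]
    v = -(psi[:, :, :, 1:] - psi[:, :, :, :-1])
    return u, v
```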
The following figure compares autoregressive rollouts from unconstrained (p1/∅) and maximally constrained (p4m/M+ρ) networks. We observed improvements in the accuracy and stability of the INS surrogates from both types of constraints (see panels b–c). The best results also outperformed networks trained with input noise. To evaluate surrogate performance beyond the point at which predictions diverge from the reference solution, we compared the power spectra of the predicted and reference velocity fields, as in previous studies. Furthermore, we compared the performance gains from symmetries and physical constraints with those offered by specialised training modes for PDE surrogates, such as data augmentation and the pushforward trick. See the following figures and tables for details.
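Such spectra are typically computed as radially binned kinetic energy spectra, along the lines of the following sketch (the binning convention is an assumption):

```python
import numpy as np

def energy_spectrum(u, v):
    """Radially binned kinetic energy spectrum of an n x n velocity field."""
    n = u.shape[0]
    u_hat, v_hat = np.fft.fft2(u), np.fft.fft2(v)
    ke = 0.5 * (np.abs(u_hat) ** 2 + np.abs(v_hat) ** 2) / n**4  # energy per mode
    k1d = np.fft.fftfreq(n, d=1.0 / n)
    KX, KY = np.meshgrid(k1d, k1d, indexing="ij")
    k = np.sqrt(KX**2 + KY**2)
    edges = np.arange(0.5, n // 2, 1.0)                          # integer-wavenumber shells
    spectrum = np.array([ke[(k >= lo) & (k < hi)].sum()
                         for lo, hi in zip(edges[:-1], edges[1:])])
    return 0.5 * (edges[:-1] + edges[1:]), spectrum
```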
We also tested surrogates on ICs with the peak wavenumber changed from 10 to 8 or 6. p4m/M+ρ rollouts more closely matched the reference solver and its spectra; see the following figures.
We investigated how the enhancement of neural INS surrogates by symmetries and physical constraints depends on network and dataset size. We trained p1/∅ and p4m/M+ρ networks with 0.1M, 2M and 8.5M parameters on the same dataset (100 simulations). Training 0.1M-parameter p1/∅ and p4m/M+ρ networks on datasets of 100, 400 and 760 simulations showed that the constraints improved performance robustly across dataset sizes; see details in the following figures.
The ocean current velocity data were sourced from the Global Ocean Physics Analysis and Forecast product. We selected 6-hourly data from 2022-06-01 to 2025-04-05 in three regions from different oceans, with the following latitude and longitude ranges: (30∼42, 168∼180), (-49∼-37, 80∼92), and (-51∼-39, -30∼-18). Each region is covered by a 144×144 grid, which we downscale to a coarser 48×48 grid for training, validation and testing in order to reduce the size of the dataset.
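The coarsening step amounts to block averaging, for example as in the sketch below (a simple block-mean; the actual regridding method may differ):

```python
import numpy as np

def coarsen(field, factor=3):
    """Block-mean coarsening of a 2D field, e.g. 144 x 144 -> 48 x 48 with factor=3."""
    h, w = field.shape
    assert h % factor == 0 and w % factor == 0
    return field.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))
```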
After training, we make predictions for a region and time period distinct from the training data: the latitude and longitude range is (-44∼-32, -130∼-118), and the prediction period runs from 2023-06-01 to 2025-04-05. The figure below shows the results.
Furthermore, the following table compares the nRMSE of the zonal velocity, the energy spectrum error (ESE), and the correlation with the real ocean currents for 12- and 120-hour predictions.
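For reference, these error metrics can be computed along the following lines (the normalization in nRMSE is one common choice and may differ from our exact definition):

```python
import numpy as np

def nrmse(pred, ref):
    """RMSE normalized by the root-mean-square of the reference field."""
    return float(np.sqrt(np.mean((pred - ref) ** 2)) / np.sqrt(np.mean(ref**2)))

def correlation(pred, ref):
    """Pearson correlation between predicted and reference fields."""
    p = pred.ravel() - pred.mean()
    r = ref.ravel() - ref.mean()
    return float(p @ r / (np.linalg.norm(p) * np.linalg.norm(r)))
```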
@inproceedings{huang2025geometric,
title={Geometric and Physical Constraints Synergistically Enhance Neural PDE Surrogates},
author={Huang, Yunfei and Greenberg, David},
booktitle={International Conference on Machine Learning},
year={2025},
organization={PMLR}
}