ECCOMAS 2024

An operator preconditioning perspective on training in physics-informed machine learning

  • De Ryck, Tim (ETH Zürich)
  • de Bézenac, Emmanuel (ETH Zürich)
  • Bonnet, Florent (Sorbonne Université)
  • Mishra, Siddhartha (ETH Zürich)


We investigate the behavior of gradient descent algorithms in physics-informed machine learning methods such as physics-informed neural networks (PINNs), which minimize residuals associated with partial differential equations (PDEs). Our key result is that the difficulty in training these models is closely related to the conditioning of a specific differential operator. This operator, in turn, is associated with the Hermitian square of the differential operator of the underlying PDE. If this operator is ill-conditioned, training is slow or infeasible. Therefore, preconditioning this operator is crucial. We employ both rigorous mathematical analysis and empirical evaluations to investigate various strategies, explaining how they improve the conditioning of this critical operator and consequently accelerate training. More specifically, we quantify the condition number for linear models in both the preconditioned and the unpreconditioned case, supported by empirical results. Experiments with nonlinear models such as neural networks also highlight the importance of proper preconditioning. Finally, we use the developed theory to shed new light on various existing methods for improving the training of physics-informed models, such as (i) balancing terms in the loss function, (ii) hard versus soft boundary conditions, (iii) second-order optimizers such as Newton's method, and (iv) domain decomposition.
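To make the central mechanism concrete, below is a minimal numerical sketch (not the authors' code; it assumes a standard finite-difference discretization of the 1D Poisson problem -u'' = f with homogeneous Dirichlet boundary conditions). Minimizing the squared residual ||Au - f||^2 by gradient descent is governed by the Hermitian square A^T A, whose condition number is the square of that of A; preconditioning with an approximate inverse of A restores good conditioning.

    import numpy as np

    n = 100
    h = 1.0 / (n + 1)
    # Second-order finite-difference discretization of -u'' on (0, 1)
    # with homogeneous Dirichlet boundary conditions.
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2

    # Gradient descent on the residual loss ||A u - f||^2 is driven by the
    # Hermitian square A^T A, so its condition number squares that of A.
    print(f"cond(A)     = {np.linalg.cond(A):.2e}")
    print(f"cond(A^T A) = {np.linalg.cond(A.T @ A):.2e}")

    # An (idealized) preconditioner P ~ A^{-1} applied to the residual
    # restores the conditioning of the operator driving the dynamics.
    P = np.linalg.inv(A)
    PA = P @ A
    print(f"cond((PA)^T (PA)) = {np.linalg.cond(PA.T @ PA):.2e}")  # ~ 1

In this idealized setting the preconditioned operator has condition number close to one; in practice the preconditioner is only an approximation of the inverse, and the linear-model analysis mentioned above quantifies the resulting condition numbers.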