ECCOMAS 2024

A Pencil-Decomposed Numerical Algorithm for Many-GPU Calculations of Turbulent Wall Flows at High Reynolds Number

  • Diez Sanhueza, Rafael (Delft University of Technology)
  • Peeters, Jurriaan (Delft University of Technology)
  • Costa, Pedro (Delft University of Technology)


Turbulent flows at high Reynolds numbers remain one of the most complex unsolved problems in engineering and physics, since their dynamics can only be fully resolved through massive simulations. In this work, we extended a Direct Numerical Simulation (DNS) solver with a distributed parallel tridiagonal solver for calculations on multiple CPU or GPU devices. The solver has been implemented in a two-dimensional domain decomposition setting that was observed to scale up to thousands of GPUs on the flagship European supercomputer Leonardo. Additionally, we propose a reformulated variant of the parallel tridiagonal solver that reduces both the number of arithmetic operations and the amount of data communication in multi-GPU configurations. We investigated the influence of 2D pencil decompositions for large-scale DNS studies of wall-bounded flows with up to about 95 billion grid points (corresponding to a friction Reynolds number of 10,000) and up to 1024 NVIDIA A100 GPUs. In such a setting, one-dimensional decompositions become inefficient or even impossible, owing either to limited memory or to a hard limit on the maximum number of GPUs. Our results show that, at scale, the Poisson solver is about 2 times faster than the previous version based on the full-transpose method. Strong scalability tests also show clear performance gains for the present approach. In weak scalability tests with approximately 300 million grid points per GPU, the performance degradation of the Poisson solver is about 7% when increasing the number of GPUs from 32 to 256, whereas the previous full-transpose method degrades by up to 40%. The approach therefore offers substantially better performance for large-scale DNS studies under physically relevant conditions.
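
To make the idea of a 2D pencil decomposition concrete, the following is a minimal sketch of how a global grid might be partitioned across a two-dimensional process grid. It is not the authors' implementation: the function names (`split_extent`, `pencil_local_shape`, `max_ranks_slab`), the choice of keeping the third direction local to each rank, and the toy grid sizes are all assumptions made for illustration. The sketch also shows why a one-dimensional (slab) decomposition has a hard limit on the number of ranks: it cannot use more ranks than there are grid points along the single direction being split.

```python
def split_extent(n, parts, index):
    """Split n points into `parts` near-equal blocks; return (start, size) of block `index`.

    The first (n % parts) blocks receive one extra point.
    """
    base, rem = divmod(n, parts)
    size = base + (1 if index < rem else 0)
    start = index * base + min(index, rem)
    return start, size


def pencil_local_shape(global_shape, proc_grid, coords):
    """Local array shape for one rank in a hypothetical z-aligned pencil layout.

    global_shape: (nx, ny, nz) global grid size
    proc_grid:    (px, py) number of ranks along x and y
    coords:       (ix, iy) position of this rank in the process grid
    The z direction is assumed to stay entirely local to each rank.
    """
    nx, ny, nz = global_shape
    px, py = proc_grid
    ix, iy = coords
    _, lx = split_extent(nx, px, ix)
    _, ly = split_extent(ny, py, iy)
    return (lx, ly, nz)


def max_ranks_slab(global_shape, axis):
    """Upper bound on ranks for a 1D (slab) decomposition along `axis`."""
    return global_shape[axis]


def max_ranks_pencil(global_shape, axes):
    """Upper bound on ranks for a 2D (pencil) decomposition along two axes."""
    a, b = axes
    return global_shape[a] * global_shape[b]


if __name__ == "__main__":
    shape = (8, 6, 4)  # toy grid, far smaller than any real DNS
    print(pencil_local_shape(shape, (2, 3), (0, 0)))
    print(max_ranks_slab(shape, 0), max_ranks_pencil(shape, (0, 1)))
```

In this layout each rank owns complete lines of data along one direction, so a tridiagonal system along that direction can be solved locally, while systems along the two split directions require either data transposes or a distributed tridiagonal solver of the kind described in the abstract.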