Information

  • Publication Type: Master Thesis
  • Date: November 2025
  • Date (Start): April 2025
  • Date (End): November 2025
  • TU Wien Library: AC17718798
  • Diploma Examination: 24. November 2025
  • Open Access: yes
  • First Supervisor: Michael Wimmer
  • Pages: 86
  • Keywords: GPU Acceleration, Computational Fluid Dynamics, CUDA, Urban Microclimate Modeling, Poisson Equation Solver, Fast Fourier Transform, Mixed-Precision Methods, Climate-Resilient Urban Planning

Abstract

Due to climate change, the frequency and severity of extreme weather events are increasing, which endangers human livelihoods and key infrastructures. Decision support tools can guide the development of climate-resilient cities by providing information on the potential effectiveness of specific measures during the planning process. In urban environments, decision support tools that incorporate accurate microclimate models are particularly effective. PALM-4U, a state-of-the-art, scientifically validated microclimate model, could offer this functionality, but it remains largely inaccessible outside the scientific community as it is optimised to run on HPC clusters. However, with the rise of high-performance GPUs, a shift towards single workstations is possible.

This study investigates the potential for performance increase of the PALM-4U pressure solver by utilising the GPU’s acceleration potential in combination with a change in target architecture. Performance increase is measured using three parameters: speed-up, validity (via NMSE, R, and FB), and memory efficiency. In addition, the effect on the runtime of the full simulation is measured and possible bottlenecks are identified. Finally, the full model is analysed to assess the overall feasibility of GPU optimisation, providing insights to guide future development.

The pressure solver transforms the 3D Poisson equation using Fast Fourier Transform and solves the resulting 1D system via the Thomas algorithm. The code structure is optimised, CUDA-optimised kernels are implemented, and the cuFFT library is integrated. In addition, a mixed-precision approach is tested to evaluate its impact on performance and accuracy.

The single core GPU implementation achieves a speed-up of up to 65.5 times in single precision and up to 49.3 times in double precision for large domain sizes. The stability of the system remains unaffected by the mixed-precision approach, and no significant variation is observed between FP32 and FP64 runs. After 45 × 10³ simulation steps, NMSE (0.02), FB (-0.017) and R (0.96) demonstrate stable and accurate performance consistent across precisions. Additionally, the memory requirement is reduced by up to 68% compared to the baseline CPU solver. The optimisations lead to a runtime reduction of the full model by 15%, demonstrating the potential for accessible, scientifically validated microclimate models.
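
For readers unfamiliar with the Thomas algorithm named in the abstract, the following is a minimal, illustrative sketch of it in plain Python. It is not the thesis's CUDA kernel: the function name thomas_solve, the argument layout, and the small 4x4 test system are assumptions made purely for illustration, and the sketch assumes a tridiagonal system that needs no pivoting (for example, a diagonally dominant one).

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system A x = rhs with the Thomas algorithm.

    lower[i] is A[i][i-1] (lower[0] is unused), diag[i] is A[i][i],
    upper[i] is A[i][i+1] (upper[-1] is unused).
    Assumes no pivoting is needed, e.g. A is diagonally dominant.
    """
    n = len(diag)
    c_prime = [0.0] * n  # modified super-diagonal
    d_prime = [0.0] * n  # modified right-hand side

    # Forward sweep: eliminate the sub-diagonal row by row.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution: recover the unknowns from last to first.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


# Hypothetical test system: -x[i-1] + 4*x[i] - x[i+1] = rhs[i]
lower = [0.0, -1.0, -1.0, -1.0]
diag = [4.0, 4.0, 4.0, 4.0]
upper = [-1.0, -1.0, -1.0, 0.0]
x_true = [1.0, 2.0, 3.0, 4.0]
rhs = [2.0, 4.0, 6.0, 13.0]  # A applied to x_true by hand

x = thomas_solve(lower, diag, upper, rhs)
```

The forward sweep and back substitution each touch every row once, so the cost grows linearly with the system size. Each row depends on the previous one, which makes a single solve sequential; a GPU version would presumably gain its parallelism by solving many independent systems at once, although the abstract does not state how the thesis organises this.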

BibTeX

@mastersthesis{north-2025-aog,
  title =      "Analysis of the GPU Acceleration Potential of the FFT-Based
               Pressure Solver in the PALM-4U Model System",
  author =     "Stefanie North",
  year =       "2025",
  abstract =   "Due to climate change, the frequency and severity of extreme
               weather events are increasing, which endangers human
               livelihoods and key infrastructures. Decision support tools
               can guide the development of climate-resilient cities by
               providing information on the potential effectiveness of
               specific measures during the planning process. In urban
               environments, decision support tools that incorporate
               accurate microclimate models are particularly effective.
               PALM-4U, a state-of-the-art, scientifically validated
               microclimate model, could offer this functionality, but
               it remains largely inaccessible outside the scientific
               community as it is optimised to run on HPC clusters.
               However, with the rise of high-performance GPUs, a shift
               towards single workstations is possible. This study
               investigates the potential for performance increase of the
               PALM-4U pressure solver by utilising the GPU’s
               acceleration potential in combination with a change in target
               architecture. Performance increase is measured using three
               parameters: speed-up, validity (via NMSE, R, and FB), and
               memory efficiency. In addition, the effect on the runtime of the
               full simulation is measured and possible bottlenecks are
               identified. Finally, the full model is analysed to assess the
               overall feasibility of GPU optimisation, providing insights
               to guide future development. The pressure solver transforms
               the 3D Poisson equation using Fast Fourier Transform and
               solves the resulting 1D system via the Thomas algorithm. The
               code structure is optimised, CUDA-optimised kernels are
               implemented, and the cuFFT library is integrated. In addition,
               a mixed-precision approach is tested to evaluate its impact
               on performance and accuracy. The single core GPU
               implementation achieves a speed-up of up to 65.5 times in
               single precision and up to 49.3 times in double precision
               for large domain sizes. The stability of the system remains
               unaffected by the mixed-precision approach, and no
               significant variation is observed between FP32 and FP64
               runs. After $45 \times 10^{3}$ simulation steps, NMSE (0.02),
               FB (-0.017) and R (0.96) demonstrate stable and accurate
               performance consistent across precisions. Additionally, the
               memory requirement is reduced by up to 68% compared to the
               baseline CPU solver. The optimisations lead to a runtime
               reduction of the full model by 15%, demonstrating the
               potential for accessible, scientifically validated
               microclimate models.",
  month =      nov,
  pages =      "86",
  address =    "Favoritenstrasse 9-11/E193-02, A-1040 Vienna, Austria",
  school =     "Research Unit of Computer Graphics, Institute of Visual
               Computing and Human-Centered Technology, Faculty of
               Informatics, TU Wien",
  keywords =   "GPU Acceleration, Computational Fluid Dynamics, CUDA, Urban
               Microclimate Modeling, Poisson Equation Solver, Fast Fourier
               Transform, Mixed-Precision Methods, Climate-Resilient Urban
               Planning",
  URL =        "https://www.cg.tuwien.ac.at/research/publications/2025/north-2025-aog/",
}