Information
- Publication Type: Master Thesis
- Workgroup(s)/Project(s):
- Date: November 2025
- Date (Start): April 2025
- Date (End): November 2025
- TU Wien Library: AC17718798
- Diploma Examination: 24. November 2025
- Open Access: yes
- First Supervisor: Michael Wimmer

- Pages: 86
- Keywords: GPU Acceleration, Computational Fluid Dynamics, CUDA, Urban Microclimate Modeling, Poisson Equation Solver, Fast Fourier Transform, Mixed-Precision Methods, Climate-Resilient Urban Planning
Abstract
Due to climate change, the frequency and severity of extreme weather events are increasing, which endangers human livelihoods and key infrastructure. Decision support tools can guide the development of climate-resilient cities by providing information on the potential effectiveness of specific measures during the planning process. In urban environments, decision support tools that incorporate accurate microclimate models are particularly effective. PALM-4U, a state-of-the-art, scientifically validated microclimate model, could offer this functionality; however, it remains largely inaccessible outside the scientific community because it is optimised to run on HPC clusters. With the rise of high-performance GPUs, a shift towards single workstations becomes possible.
This study investigates the potential performance increase of the PALM-4U pressure solver achieved by exploiting GPU acceleration in combination with a change in target architecture. The performance increase is measured using three parameters: speed-up, validity (via NMSE, R, and FB), and memory efficiency. The effect on the runtime of the full simulation is also measured, and possible bottlenecks are identified. Finally, the full model is analysed to assess the overall feasibility of GPU optimisation, providing insights to guide future development.
The pressure solver transforms the 3D Poisson equation using the Fast Fourier Transform and solves the resulting 1D system via the Thomas algorithm. The code structure is optimised, CUDA-optimised kernels are implemented, and the cuFFT library is integrated. In addition, a mixed-precision approach is tested to evaluate its impact on performance and accuracy.
The single core GPU implementation achieves a speed-up of up to 65.5 times in single precision and up to 49.3 times in double precision for large domain sizes. The stability of the system remains unaffected by the mixed-precision approach, and no significant variation is observed between FP32 and FP64 runs. After 45 × 10³ simulation steps, NMSE (0.02), FB (-0.017) and R (0.96) demonstrate stable and accurate performance that is consistent across precisions. Additionally, the memory requirement is reduced by up to 68% compared to the baseline CPU solver. The optimisations lead to a 15% reduction in the runtime of the full model, demonstrating the potential for accessible, scientifically validated microclimate models.
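The abstract names the Thomas algorithm as the method used for the 1D systems that remain after the Fourier transform. As a rough illustration of that step only, the following is a minimal pure-Python sketch of the textbook Thomas algorithm for a tridiagonal system. The function names `thomas_solve` and `tridiag_matvec`, the array layout, and the example coefficients (a 1D Laplacian with an assumed diagonal shift `k2`) are hypothetical choices for this sketch; they are not taken from the PALM-4U source or from the thesis's CUDA kernels.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the textbook Thomas algorithm.

    All four inputs have length n. lower[0] and upper[n-1] are ignored.
    No pivoting is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[n - 1] = d_prime[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def tridiag_matvec(lower, diag, upper, x):
    """Multiply the tridiagonal matrix by x (used to check the residual)."""
    n = len(diag)
    y = []
    for i in range(n):
        value = diag[i] * x[i]
        if i > 0:
            value += lower[i] * x[i - 1]
        if i < n - 1:
            value += upper[i] * x[i + 1]
        y.append(value)
    return y


if __name__ == "__main__":
    # Hypothetical example: a 1D Laplacian with an assumed shift k2 on the diagonal.
    n, k2 = 8, 0.5
    lower = [-1.0] * n
    upper = [-1.0] * n
    diag = [2.0 + k2] * n
    rhs = [float(i + 1) for i in range(n)]

    x = thomas_solve(lower, diag, upper, rhs)
    residual = max(abs(a - b) for a, b in zip(tridiag_matvec(lower, diag, upper, x), rhs))
    print("max residual:", residual)
```

The algorithm is sequential along one system but independent across systems, which is why a GPU version would typically assign one tridiagonal system per thread; that mapping is an assumption here, not a description of the thesis implementation.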
BibTeX
@mastersthesis{north-2025-aog,
title = "Analysis of the GPU Acceleration Potential of the FFT-Based
Pressure Solver in the PALM-4U Model System",
author = "Stefanie North",
year = "2025",
abstract = "Due to climate change, the frequency and severity of extreme
weather events are increasing, which endangers human
livelihoods and key infrastructure. Decision support tools
can guide the development of climate-resilient cities by
providing information on the potential effectiveness of
specific measures during the planning process. In urban
environments, decision support tools that incorporate
accurate microclimate models are particularly effective.
PALM-4U, a state-of-the-art, scientifically validated
microclimate model, could offer this functionality; however,
it remains largely inaccessible outside the scientific
community because it is optimised to run on HPC clusters.
With the rise of high-performance GPUs, a shift towards
single workstations becomes possible. This study
investigates the potential performance increase of the
PALM-4U pressure solver achieved by exploiting GPU
acceleration in combination with a change in target
architecture. The performance increase is measured using
three parameters: speed-up, validity (via NMSE, R, and FB),
and memory efficiency. The effect on the runtime of the
full simulation is also measured, and possible bottlenecks
are identified. Finally, the full model is analysed to
assess the overall feasibility of GPU optimisation,
providing insights to guide future development. The
pressure solver transforms the 3D Poisson equation using the
Fast Fourier Transform and solves the resulting 1D system
via the Thomas algorithm. The code structure is optimised,
CUDA-optimised kernels are implemented, and the cuFFT
library is integrated. In addition, a mixed-precision
approach is tested to evaluate its impact on performance and
accuracy. The single core GPU implementation achieves a
speed-up of up to 65.5 times in single precision and up to
49.3 times in double precision for large domain sizes. The
stability of the system remains unaffected by the
mixed-precision approach, and no significant variation is
observed between FP32 and FP64 runs. After $45 \times
10^{3}$ simulation steps, NMSE (0.02), FB (-0.017) and R
(0.96) demonstrate stable and accurate performance that is
consistent across precisions. Additionally, the memory
requirement is reduced by up to 68% compared to the baseline
CPU solver. The optimisations lead to a 15% reduction in the
runtime of the full model, demonstrating the potential for
accessible, scientifically validated microclimate models.",
month = nov,
pages = "86",
address = "Favoritenstrasse 9-11/E193-02, A-1040 Vienna, Austria",
school = "Research Unit of Computer Graphics, Institute of Visual
Computing and Human-Centered Technology, Faculty of
Informatics, TU Wien",
keywords = "GPU Acceleration, Computational Fluid Dynamics, CUDA, Urban
Microclimate Modeling, Poisson Equation Solver, Fast Fourier
Transform, Mixed-Precision Methods, Climate-Resilient Urban
Planning",
URL = "https://www.cg.tuwien.ac.at/research/publications/2025/north-2025-aog/",
}