LOTOS-EUROS Optimization

LOTOS-EUROS is a computer model to predict ozone, particulate matter and aerosols for the bottom layer of the atmosphere. It is used for the daily predictions of air quality, among many other applications. The main part of the code was already parallelized. Yet, when asked by KNMI to look into the code, VORtech managed to increase the performance by another 30%.

Speeding up LOTOS-EUROS

The LOTOS-EUROS model was developed by TNO and RIVM. It has a long history and today it is one of the leading models in its field in Europe. Because it is also used operationally, its run time must be kept to a minimum. For that reason, the main part of the code had already been parallelized to make good use of the computing hardware at KNMI. However, the performance turned out to be far lower than expected and varied considerably from one computing platform to another.

Steps to a higher performance

The performance improvements were achieved in a number of steps:

  • We started by analysing the performance with our own tools on our own computers. That confirmed the results that had been found before on the computers of KNMI.
  • Then, we took a closer look at the computations that were causing the problems:
    • In one case, it was possible to shift the parallelization to a higher level. This makes it possible to distribute larger chunks of work and consequently reduces the overhead for starting and stopping parallel threads. A downside was that we had to introduce extra arrays to store intermediate results.
    • At one location in the code, an interpolation was done repeatedly. That could be made faster by storing the interpolation weights instead of recomputing them every time.
  • Finally, we did extensive performance tests to see if the modifications were sufficient.
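
The interpolation step above can be illustrated with a small sketch. This is not the LOTOS-EUROS code, which is not shown here. It assumes one-dimensional linear interpolation on a fixed grid, and the names precompute_weights and apply_weights are hypothetical. The point is only that the search for the enclosing grid cell and the computation of the weights are done once, after which every later interpolation is a cheap weighted sum.

```python
from bisect import bisect_right


def precompute_weights(src_x, dst_x):
    """Return one (left index, weight) pair per target point.

    Assumes src_x is sorted, strictly increasing, and has at least two points.
    """
    weights = []
    for x in dst_x:
        # Locate the source interval that contains x, clamped to a valid interval.
        i = min(max(bisect_right(src_x, x) - 1, 0), len(src_x) - 2)
        # Weight of the right-hand neighbour; the left neighbour gets 1 - w.
        w = (x - src_x[i]) / (src_x[i + 1] - src_x[i])
        weights.append((i, w))
    return weights


def apply_weights(weights, src_values):
    """Interpolate src_values to the target points using stored weights."""
    return [(1.0 - w) * src_values[i] + w * src_values[i + 1] for i, w in weights]


# Hypothetical grids, chosen only for illustration.
src_x = [0.0, 1.0, 2.0, 4.0]
dst_x = [0.5, 1.5, 3.0]

# The expensive part (interval search and weights) is done once ...
weights = precompute_weights(src_x, dst_x)

# ... and reused for every field that lives on the same grid.
for step in range(3):
    field = [10.0 * step + x for x in src_x]
    interpolated = apply_weights(weights, field)
    print(step, interpolated)
```

The price is the same as for the shifted parallelization: some extra memory, here one index and one weight per target point.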

A good result

In the end, the reference computation took 80 seconds on 8 cores, where it took 125 seconds before. On a single core the computation took 280 seconds. A speed-up of only about 3.5 seems little for eight cores, but that is because large parts of the code have not yet been parallelized. As the parallel part gets faster, the non-parallelized parts start to dominate the overall performance.
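
The effect of the non-parallelized parts can be estimated with Amdahl's law. The sketch below is a rough back-of-the-envelope calculation, not a measurement. It assumes that the run time on n cores is T1 * (s + (1 - s) / n), where s is the fraction of the work that does not scale with the number of cores, and it lumps threading overhead into that fraction. The function names are hypothetical; the timings are the ones quoted above.

```python
def speedup(t_single, t_parallel):
    """Speed-up relative to the single-core run time."""
    return t_single / t_parallel


def serial_fraction(t_single, t_parallel, cores):
    """Effective non-scaling fraction s according to Amdahl's law.

    Solves t_parallel / t_single = s + (1 - s) / cores for s.
    """
    ratio = t_parallel / t_single
    return (ratio - 1.0 / cores) / (1.0 - 1.0 / cores)


T_SINGLE = 280.0   # seconds on one core
T_BEFORE = 125.0   # seconds on 8 cores, before the optimization
T_AFTER = 80.0     # seconds on 8 cores, after the optimization
CORES = 8

for label, t in (("before", T_BEFORE), ("after", T_AFTER)):
    s = serial_fraction(T_SINGLE, t, CORES)
    print(f"{label}: speed-up {speedup(T_SINGLE, t):.2f}, "
          f"non-scaling fraction {s:.2f}, "
          f"upper bound on speed-up {1.0 / s:.1f}")
```

Under these assumptions, roughly 18% of the single-core run time does not scale with the number of cores after the optimization. That limits the achievable speed-up to a little over five, however many cores are used.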

Can your software be faster?
Read about our services for optimizing software and our expertise on High Performance Computing. Feel free to contact us; we'll be happy to come round and discuss what we can do for you.