Forecasting urban flooding with machine learning

Can we reliably forecast and manage urban flooding with a machine learning model? That question was raised at Deltares, a knowledge institute for water and the subsurface. Initial research was promising. As VORtech is building up its expertise in machine learning for physics-based modelling, we welcomed the opportunity to join Deltares and Delft University of Technology in further explorations.

Why fast models for urban flooding are important

Urban flooding is expected to become more serious in the coming years as the climate changes. At the same time, the economic value of cities is increasing as ever more people move to urban environments, economic activity grows and property values increase. Thus, urban flooding will not only occur more frequently but also have a higher economic impact. Good tools to forecast and deal with urban flooding are therefore extremely valuable, as they can provide timely warnings and help reduce the impact of heavy rainfall events that can cause flooding.

Deltares has developed the DHYDRO-1D2D model for this purpose. That model is entirely physics-based and reliable, but it is too slow for real-time forecasting systems, where water managers need to make decisions within minutes to hours. Therefore, Deltares is investigating whether machine learning can be used to create faster surrogate models for urban flooding that mimic the flood simulations of the physics-based model.

Machine learning for physics-based applications

A challenge in using machine learning for applications like these is that it is difficult to create a model that is reliable enough to be used in crisis situations.

A major problem is that the amount of training data usually pales in comparison to the data sets that are used to train, for example, large language models. Floods are rare events, which makes it challenging to provide enough cases to train a model on. Also, observational data comes with noise, bias and outright errors. The observational data can be augmented somewhat with results from simulations. But even those are usually limited, as simulation models for this kind of application are typically computationally expensive.

To compensate for the lack of training data from observations, it is best to also use whatever knowledge we have about the underlying structure and physics of what needs to be modelled. In the case of the urban flooding model, using the structure of the sewer system provides the machine learning model with a lot of extra information.

VORtech is actively building up its competence in the kind of machine learning that uses structural or scientific information in addition to observations.

We feel that our clients can reap tremendous benefits from machine learning for their physics-based applications. For example, they can use machine learning to create models that are extremely fast, allowing real-time responses, the assessment of a far wider range of scenarios than is currently possible, or the evaluation of uncertainties. Other typical use cases of machine learning for our clients include leveraging observational data to improve or correct models.

To further our experience with this kind of machine learning, we welcomed the opportunity to work with experts from Deltares and Delft University of Technology who had already been exploring these techniques for urban flood forecasting. This collaboration took the form of jointly supervising an intern, Max Verton. It proved to be a great way to learn together.

Using the structure of the sewer network in the machine learning approach

As mentioned, the structure of the sewer network is an obvious candidate to use in training a machine learning model. For that, the machine learning concept of graph neural networks is a natural fit. A graph neural network is a neural network whose architecture reflects the structure of an underlying graph, which in this case models the sewer network.
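To make the idea concrete, here is a minimal sketch of a single message-passing step, the basic building block of many graph neural networks. It is not the model developed at Delft University of Technology or Deltares. The toy network, the node features, the random weights and the function name `message_passing_step` are all assumptions made purely for illustration.

```python
import numpy as np

def message_passing_step(node_features, edges, w_self, w_neigh):
    """One round of mean-aggregation message passing on an undirected graph.

    Each node combines its own features with the average of its
    neighbours' features, so information flows along the pipes.
    """
    n_nodes = node_features.shape[0]
    aggregated = np.zeros_like(node_features)
    degree = np.zeros(n_nodes)
    for i, j in edges:
        # Every pipe passes information in both directions.
        aggregated[i] += node_features[j]
        aggregated[j] += node_features[i]
        degree[i] += 1
        degree[j] += 1
    # Average over neighbours; guard against isolated nodes.
    aggregated /= np.maximum(degree, 1)[:, None]
    return np.tanh(node_features @ w_self + aggregated @ w_neigh)

# Hypothetical toy network: 4 manholes connected in a line by 3 pipes.
edges = [(0, 1), (1, 2), (2, 3)]

# Hypothetical node features: [water level (m), rainfall inflow (m3/s)].
features = np.array([
    [0.2, 0.0],
    [0.5, 0.1],
    [0.1, 0.0],
    [0.0, 0.0],
])

# Random weights stand in for parameters that training would normally learn.
rng = np.random.default_rng(0)
w_self = rng.normal(size=(2, 3))
w_neigh = rng.normal(size=(2, 3))

updated = message_passing_step(features, edges, w_self, w_neigh)
print(updated.shape)
```

Stacking several such steps lets information travel further through the network: after k steps, each node has seen the state of all nodes within k pipes of it.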

A graph neural network for sewer systems had already been developed at Delft University of Technology. Deltares was further applying and extending this model to flooding applications. The internship assignment that we gave to Max Verton was to make it work for the real-life case of a neighbourhood in Eindhoven.

This involved the usual steps when training a machine learning model: getting the data in the proper form, determining whether architectural tweaks could be beneficial, finding the best hyperparameters and rigorously testing the outcomes. Fortunately, Max proved to be a fast learner and did an amazing job in only three months, with active support from all parties involved.

He trained the graph neural network for a small part of Eindhoven’s sewer network, using simulated results from DHYDRO-1D2D to feed information on the physics into the model. These results were also used for validation, where the model was used to forecast water levels and flooding events beyond the timeframe on which it was trained.

The model that Max developed works surprisingly well. Looking at the forecasted water levels, the graph neural network manages to come to within some ten centimetres of the actual water levels. For detecting flood events, it reached a critical success index of about 85%. This indicates that it forecasts the large majority of flooding events correctly. As the model was trained on a small subsection of the network that did not have all the features and elements of the full network, it was not expected to work well when applied to the entire Eindhoven network. A small experiment confirmed this.
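For readers unfamiliar with the critical success index, the sketch below uses its common definition: hits divided by the sum of hits, misses and false alarms. Correctly forecast non-events do not count, which makes the score suitable for rare events such as floods. How flood events were defined and counted in this project is not described here, and the observation and forecast series below are invented for illustration.

```python
def critical_success_index(observed, forecast):
    """CSI = hits / (hits + misses + false alarms).

    `observed` and `forecast` are equal-length sequences of truthy/falsy
    values, one per location or time step, marking whether a flood occurred
    or was forecast.
    """
    pairs = list(zip(observed, forecast))
    hits = sum(1 for o, f in pairs if o and f)
    misses = sum(1 for o, f in pairs if o and not f)
    false_alarms = sum(1 for o, f in pairs if f and not o)
    total = hits + misses + false_alarms
    if total == 0:
        # No events observed or forecast: the index is undefined.
        return float("nan")
    return hits / total

# Invented example: 4 hits, 1 miss, 1 false alarm, 2 correct non-events.
observed = [1, 1, 1, 0, 0, 1, 0, 1]
forecast = [1, 1, 0, 0, 1, 1, 0, 1]

csi = critical_success_index(observed, forecast)
print(f"CSI = {csi:.2f}")  # 4 / (4 + 1 + 1)
```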

Looking ahead

The end of the internship is hopefully not the end of the collaboration between Deltares, Delft University of Technology and VORtech. There is still much work to be done to turn the current prototype into a robust and reliable application.

More broadly, VORtech is open to opportunities to apply and grow our skills in machine learning for applications with a physical background. Contact us if you feel that these techniques might help you make good use of your data to create faster and more reliable models.