How will Machine Learning change Computing and Simulation?

The launch of ChatGPT made the world aware that Machine Learning will fundamentally change the way we live and work. The field of computing and simulation will be no exception. But it’s not just ChatGPT: there are other innovations from machine learning that are bringing exciting changes to computing. In this blog, we’ll explore the potential of this new technology. 

Before we go into the technological side of things, let’s first consider what the world of applied computing may look like 10 years from now. We do that in the form of a short story. In this story, the sentences in italics refer to specific innovations from machine learning and AI that are elaborated in the second part of this blog. A major disclaimer applies: the story is far too optimistic about many things, but it does place all the innovations in context and that’s the purpose here. 

Imagine the world of computing and simulation in 2033. 

It is November 2033. Meet John. John trained as a civil engineer and works for the dredging company FutureDredging Inc. This company is one of the most innovative companies in its field, using the latest advances in machine learning and AI to optimize every aspect of its operations.

Last week, John was planning a new dredging operation off the coast of Nigeria. The area is tricky: the sea floor has a very irregular topography, and the currents are extremely complex. Fortunately, John has a digital assistant that creates a 3D simulation model for him. The model is highly detailed, with a resolution of a few meters where it matters. When John first ran a simulation, he noticed with some concern all the small-scale details in the flow that would make this assignment hard. With his head-mounted display, he has been studying the flow patterns to better understand the dynamics.

Once John felt sufficiently familiar with the environment, he asked the digital assistant to make forecasts for the dredging operation. He has come to rely on these forecasts; it doesn’t even strike him as special anymore that they tend to be highly accurate even in complex situations such as these. Only the uncertainty in the weather data might still affect the forecasts, but as the dredging operation draws near, the weather predictions also become more accurate, and the dredging forecasts become more reliable every day.

Most of John’s work has been to plan the dredging operation and construct risk mitigation strategies. For this, he created a digital plan for the entire operation, including fallback options if something goes wrong. He sends the plan to the dredging vessel, which will perform most of the operations automatically.  

For safety reasons, a human still needs to be in the loop. The operation lead, Kirstin, must approve each step of the operation and stop it if something goes wrong. A few operators will be on board the vessel in case something happens that requires physical intervention.

Today is the first day of the operation. First, Kirstin and the operators need to be instructed. John sends them interactive movies of the planned operation, which they can play on their head-mounted displays. These show not just the perfectly smooth operation, but also a couple of scenarios in which things go wrong. This way, the crew can also train for the most likely contingencies.

At 12:00 precisely, Kirstin pushes the go-button on her display, instructing the vessel to start its work. It begins to dredge autonomously, compensating for small errors in the forecasts. Its control system uses the model that was created by John’s digital assistant to quickly adapt to disturbances.

But just half an hour into the operation, one of the cameras on the vessel signals a problem: the sea surface suddenly changes color. At the same time, the vessel reports that the material that it is dredging is no longer the expected material. Clearly, something is wrong. Kirstin quickly realizes what is going on: they somehow hit an oil pipe, and the oil is flowing out. This leads to oil floating on the sea surface and to oil mixing with the material that is being dredged. Kirstin curses: the pipe was nowhere in the model. Did John’s digital assistant make a mistake, or was this pipe simply not registered? But she has no time to wonder. Action must be taken because there is a high penalty for polluting the environment.

She calls John to discuss what needs to be done. As always, John first asks his digital assistant what the best course of action is. The digital assistant quickly scans all similar events and proposes a couple of ways to handle the situation. As the data from the vessel and the cameras come in, the digital assistant rapidly builds a detailed model of what is going on. It infers the location and size of the leak and the likely pressure in the pipe.  

Initially, there is a high degree of uncertainty as the data about the incident is limited. But with each new data item coming in, the digital assistant improves the model. At the same time, it uses the model to virtually run a couple of strategies to handle the situation. Within ten minutes of the first report of the incident, the digital assistant provides John with a plan. It seems sensible to him, so he sets everything up and sends it to Kirstin. She quickly writes a couple of routines to change the way the vessel operates, uploads them to the vessel’s systems and initiates the various actions. 

Obviously, the story doesn’t end here. There is environmental damage, and a claim from the authorities will come. Fortunately, FutureDredging is well covered by insurance. All the records of the incident and its handling, including the evaluations and proposals from the digital assistant, are sent to the insurance company.

There, the digital assistant of the insurance company evaluates the incident and the way it was handled. It determines that the cause of the accident was that the oil pipe was missing from the data that the digital assistant used. The claim from FutureDredging is dismissed, as it was their responsibility to obtain the correct information. FutureDredging then turns to the provider of the digital assistant to claim the damages. It turns out that the provider is indeed liable in this case, so it must pay damages to FutureDredging. But in turn, the provider files a claim against the Nigerian government agency that provided the faulty data.

The machine learning technology behind all this 

This is not intended to be a prediction but only a rather contrived story to place the future benefits of machine learning into context. Taken literally, it is certainly far too optimistic about what is feasible and skewed towards the positive effects.

In the rest of this blog, we’ll explain a bit about the technology behind the italicized sentences in the story. Much of it, but not all, is inspired by a very nice overview given by Mr. Willard from the University of Minnesota and his colleagues. The interpretation is all ours. 

“Fortunately, John has a digital assistant that creates a 3D simulation model for him.” 

Modelling an environmental situation can be quite a challenge today in 2023. Collecting and formatting all the data, generating a well-behaved grid that doesn’t cause numerical issues, and choosing the parameters often take days or even weeks and require highly specialized people.

However, many governments are actively working on making most of their data publicly available in standard formats. This will make it much easier to get all the necessary data in a convenient form. Generating a grid and estimating parameters are tasks that might be automated using machine learning, so it is likely that generating a model will be largely automated in the future. In fact, the company Tygron is already doing this in some form.
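
As a rough illustration of what automated parameter estimation might look like, the sketch below trains a regressor on hypothetical pairs of site characteristics and hand-calibrated parameter values from earlier projects. The feature names, the "roughness" parameter and all data are made up for illustration; this is a minimal sketch, not an actual workflow.

```python
# Minimal sketch: propose a model parameter (a hypothetical bed roughness
# coefficient) from site features, using calibrated values from past projects.
# All data below is synthetic and only serves to make the code runnable.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical training set: features per past site (e.g. mean depth, grain
# size, tidal range) and the roughness a specialist calibrated by hand.
site_features = rng.uniform(size=(200, 3))
calibrated_roughness = 0.02 + 0.01 * site_features[:, 0] + 0.005 * rng.normal(size=200)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(site_features, calibrated_roughness)

# For a new site, the assistant could propose a first-guess parameter instantly.
new_site = rng.uniform(size=(1, 3))
print("proposed roughness:", model.predict(new_site)[0])
```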

“The model is highly detailed, with a resolution of a few meters where it matters.” 

The resolution is higher than what is possible with today’s models because a machine learning model has been trained to fill in the details. It’s somewhat like filling in missing pieces of an image, which current generative AI can do convincingly. But we don’t want the machine learning model to just guess at what the missing details could be: it should learn to fill in details based on the context of the coarse model results and the applicable physics. Incorporating physics into machine learning is an important theme that is now a major research topic around the world. More on that later in this blog post.
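
To make the idea concrete, here is a minimal sketch, assuming pairs of coarse and fine simulation fields are available for training: a small convolutional network adds detail to an upsampled coarse velocity field, and a simple divergence penalty stands in for "the applicable physics". The data and shapes are synthetic placeholders, not results from a real model.

```python
# Minimal sketch of super-resolution for simulation fields: a CNN learns to
# map a coarse (u, v) flow field to a finer one, with an extra penalty that
# keeps the refined field approximately divergence-free.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Refiner(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, coarse):
        # Upsample the coarse field 4x, then let the CNN add the details.
        up = F.interpolate(coarse, scale_factor=4, mode="bilinear", align_corners=False)
        return up + self.net(up)

def divergence(field):
    # Finite-difference du/dx + dv/dy on the interior of the grid.
    du_dx = field[:, 0, 1:-1, 2:] - field[:, 0, 1:-1, :-2]
    dv_dy = field[:, 1, 2:, 1:-1] - field[:, 1, :-2, 1:-1]
    return du_dx + dv_dy

model = Refiner()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

coarse = torch.randn(8, 2, 16, 16)   # placeholder coarse fields
fine = torch.randn(8, 2, 64, 64)     # placeholder matching fine fields

for _ in range(100):
    pred = model(coarse)
    loss = F.mse_loss(pred, fine) + 0.1 * divergence(pred).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```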

“… they tend to be highly accurate even in complex situations such as these.” 

Forecasts can be made more reliable with machine learning by using it to model aspects of the physical world that are difficult to model explicitly. Many computational models contain parameterizations to account for effects that we cannot properly describe. With machine learning we might be able to come up with better ways to incorporate these effects in computer models and hence improve the forecasts.  
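
As a minimal sketch of such a learned parameterization, assuming we have archived forecasts and matching observations: a small network learns to predict the forecast error from the model state, so that the prediction can later be added as a correction. The data below is synthetic and only illustrates the training setup.

```python
# Minimal sketch of a learned parameterization: a small neural network is
# trained to predict the correction (observation minus model forecast) from
# the model state, so that it can be added to new forecasts.
import torch
import torch.nn as nn

state = torch.randn(1000, 4)                                       # model state at a location
residual = 0.3 * state[:, :1] ** 2 + 0.05 * torch.randn(1000, 1)   # stand-in for "missing physics"

correction = nn.Sequential(nn.Linear(4, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(correction.parameters(), lr=1e-2)

for _ in range(500):
    loss = nn.functional.mse_loss(correction(state), residual)
    opt.zero_grad()
    loss.backward()
    opt.step()

# At forecast time: corrected_forecast = raw_forecast + correction(current_state)
```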

“… interactive movies of the planned operation which they can play on their head-mounted display.” 

The realistic movies that appear at several points in the story are not a big stretch from what AI can already do. Even today, virtual reality is used for interacting with the results of computations. The concept is included in this story to highlight that this might become far more common than it is today as the technology becomes embedded in everyday life.

“Its control system uses the model that was created by John’s digital assistant.” 

Here we come upon what is probably the most powerful application of machine learning for computing and simulation: replacing a heavy computational model with a superfast machine learning model, a so-called surrogate model. It is introduced here as part of the vessel’s control system, but the concept occurs at many points in the story. It makes it possible for John to interactively explore the situation in the dredging area, and it allows for extensive optimization of the handling of the oil spill.

The most common way to build a surrogate model with machine learning is by training some sort of deep learning network. Once the network is trained, it can provide simulation results in the blink of an eye. An important aspect of this approach is that the laws of physics must be respected by the machine learning model. Otherwise, the simulation could produce fundamentally wrong results. There is a lot of research on this topic, and much is already possible.  
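
Leaving the physics constraints aside for a moment, a bare-bones surrogate could look like the sketch below: a neural network is fitted to a hypothetical archive of precomputed simulation runs and then replaces the expensive model at prediction time. The input and output names and all data are placeholders.

```python
# Minimal sketch of a surrogate model: an MLP is trained on pairs of
# simulation inputs and outputs (precomputed with the full model) and then
# stands in for the expensive simulation at prediction time.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Hypothetical archive of 2000 full simulation runs:
# inputs = (current speed, wave height, dredge rate), outputs = (plume extent, turbidity)
X = rng.uniform(size=(2000, 3))
y = np.column_stack([X[:, 0] * X[:, 2], np.sin(X[:, 1]) + 0.1 * X[:, 2]])

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
surrogate.fit(X, y)

# A single evaluation now takes a fraction of a second instead of hours.
print(surrogate.predict([[0.4, 0.8, 0.6]]))
```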

Another issue in training a deep learning network for this type of application is that the available real-world data is often scarce and noisy. We could generate synthetic data by running simulations, but these are typically time consuming and expensive. Incorporating physics in the machine learning model helps here by providing the knowledge we already have about the relationships between the variables. This makes it possible to use machine learning even if the actual amount of data is relatively small.

There is one aspect that was carefully swept under the rug in the story above: a surrogate model needs to be trained, and training takes time and resources. A lot of time and a lot of resources, in fact. So, even if the digital assistant quickly generated a model of the dredging area, it would not be that easy to also generate the corresponding surrogate model on the spot.

One potential and very different approach to accelerate computation with machine learning is to use it as a solver for differential equations. Neural networks can be used in a similar way to finite element or finite volume methods, but they can avoid the ‘curse of dimensionality’: for systems with many dimensions, neural networks can outperform traditional methods. Clearly, this approach is not yet as established as finite difference or finite element methods, but there are already interesting results on using machine learning to solve the Navier-Stokes equations that are so central to civil engineering.
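
To make the idea concrete, the sketch below trains a small physics-informed neural network on the 1D heat equation, a toy stand-in for the far harder Navier-Stokes equations: the network is penalized wherever its output violates the PDE, the initial condition or the boundary conditions. This is an illustration of the principle, not a production solver.

```python
# Minimal physics-informed neural network (PINN) for u_t = u_xx on [0, 1],
# with u(x, 0) = sin(pi x) and u(0, t) = u(1, t) = 0.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def u(x, t):
    return net(torch.cat([x, t], dim=1))

for _ in range(2000):
    # Interior collocation points: enforce the PDE residual u_t - u_xx = 0.
    x = torch.rand(256, 1, requires_grad=True)
    t = torch.rand(256, 1, requires_grad=True)
    out = u(x, t)
    u_t = torch.autograd.grad(out, t, torch.ones_like(out), create_graph=True)[0]
    u_x = torch.autograd.grad(out, x, torch.ones_like(out), create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, torch.ones_like(u_x), create_graph=True)[0]
    pde_loss = (u_t - u_xx).pow(2).mean()

    # Initial condition and boundary conditions.
    x0 = torch.rand(64, 1)
    ic_loss = (u(x0, torch.zeros_like(x0)) - torch.sin(torch.pi * x0)).pow(2).mean()
    tb = torch.rand(64, 1)
    bc_loss = u(torch.zeros_like(tb), tb).pow(2).mean() + u(torch.ones_like(tb), tb).pow(2).mean()

    loss = pde_loss + ic_loss + bc_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```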

“One of the cameras on the vessel signals a problem.” 

Like the use of computer-generated movies, the use of machine vision is already common practice today. There are many applications in quality control, and detecting the oil spill in the story above is essentially a form of quality control. In this respect, the future is already here.
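
As a toy illustration of this kind of quality control, the sketch below fits an anomaly detector to colour histograms of "normal" camera frames and flags frames that look markedly different, such as a sudden oil sheen. Real systems would of course use actual imagery and more capable vision models; the frames here are random arrays.

```python
# Minimal sketch of camera-based anomaly detection using colour histograms
# and an IsolationForest. Frames are random arrays standing in for images.
import numpy as np
from sklearn.ensemble import IsolationForest

def colour_histogram(frame, bins=8):
    # Concatenate per-channel histograms into one feature vector.
    return np.concatenate([np.histogram(frame[..., c], bins=bins, range=(0, 1))[0]
                           for c in range(3)])

rng = np.random.default_rng(2)
normal_frames = rng.beta(2, 5, size=(200, 64, 64, 3))          # typical sea colours
features = np.array([colour_histogram(f) for f in normal_frames])

detector = IsolationForest(random_state=0).fit(features)

new_frame = rng.beta(5, 2, size=(64, 64, 3))                    # markedly different colours
print("anomaly" if detector.predict([colour_histogram(new_frame)])[0] == -1 else "normal")
```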

“It infers the location and size of the leak and the likely pressure in the pipe.” 

This type of use is called inverse modelling. We could train a machine learning algorithm to predict which simulation settings correspond to a given set of observed outputs. Admittedly, the story above is a bit of a stretch, as an inverse model for that specific situation cannot be trained instantly.
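
A minimal sketch of the approach, with a toy function standing in for the forward simulation: pairs of parameters and outputs are generated with the forward model, and a regressor is trained in the reverse direction so that observations can be mapped back to likely leak parameters. The parameter and observation names are hypothetical.

```python
# Minimal sketch of inverse modelling: train a regressor from simulated
# observations back to the parameters that produced them.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)

def toy_forward_model(params):
    # params = (leak size, pipe pressure) -> observations = (slick area, concentration)
    size, pressure = params[:, 0], params[:, 1]
    return np.column_stack([size * pressure, np.sqrt(size) + 0.1 * pressure])

params = rng.uniform(0.1, 1.0, size=(5000, 2))
observations = toy_forward_model(params) + 0.01 * rng.normal(size=(5000, 2))

inverse_model = RandomForestRegressor(n_estimators=200, random_state=0)
inverse_model.fit(observations, params)           # note the reversed direction

# Given what the cameras and sensors report, infer the likely leak parameters.
observed = np.array([[0.35, 0.75]])
print("estimated (size, pressure):", inverse_model.predict(observed)[0])
```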

“The digital assistant quickly scans all similar events and proposes a couple of ways to handle the situation.” 

The fact that the AI has learned from past experiences with other oil spills is not something that is specific to computing and simulation but is a common benefit of AI in most applications. The only remark here is that information about the handling of oil spills must be shared for the AI to learn from it. This might be a challenge given the competitive environment in dredging. Perhaps the government or branch organizations should play a role here.

“Initially, there is a high degree of uncertainty as the data about the incident is limited.” 

Machine learning can be used for uncertainty quantification. If running a simulation with a surrogate model is cheap, then many simulations can be performed, each with slightly different inputs, to estimate the probability distribution of the outputs. But even smarter methods exist to quantify uncertainty directly with neural networks.
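
The brute-force ensemble approach is easy to sketch, assuming a fast surrogate is available (the surrogate function below is just a placeholder): sample the uncertain inputs many times, evaluate the surrogate for each sample, and summarise the spread of the outputs.

```python
# Minimal sketch of uncertainty quantification with a cheap surrogate:
# Monte Carlo sampling of the inputs and percentiles of the outputs.
import numpy as np

rng = np.random.default_rng(4)

def surrogate(inputs):
    # Placeholder for a trained surrogate: inputs -> predicted plume extent.
    return inputs[:, 0] * inputs[:, 2] + 0.2 * np.sin(inputs[:, 1])

# Uncertain inputs: nominal values plus an assumed spread.
nominal = np.array([0.4, 0.8, 0.6])
samples = nominal + rng.normal(scale=[0.05, 0.2, 0.02], size=(10000, 3))

predictions = surrogate(samples)
low, median, high = np.percentile(predictions, [5, 50, 95])
print(f"plume extent: {median:.2f} (90% interval {low:.2f} - {high:.2f})")
```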

“She quickly writes a couple of routines.” 

Already today, large language models are used as programming assistants. Where we used to search around on platforms like StackOverflow to find out how to solve a programming issue, we can now simply ask ChatGPT or any other similar tool to show us how it’s done and give us a code sample. We can even ask it to optimize a piece of code, and most of the time the result is decent enough. That saves a lot of time. Also, people in operations can now do more of the programming themselves without the help of actual software engineers. Whether this is a blessing or a curse is perhaps a good topic for another blog.

“A claim from the authorities will come.” 

The last paragraph is a bit of a side story. It hints at the legal pitfalls that will arise. The entire subject of liability in the use of AI will surely see much development in the coming years, certainly now that the European Union is working on the AI Act. But that topic is more than sufficient to fill many other blogs.

Finally 

There is an extensive body of literature on the use of machine learning for computational science and engineering. Researchers have jumped at the opportunity to discover entirely new methods that can provide better solutions to old problems or solve previously intractable problems. This new research field is known under various names, such as Physics-Informed Machine Learning or Scientific Machine Learning.

This blog post has given a taste of the potential impact of machine learning on computational science and engineering. This impact is only beginning to show. On the Gartner Hype Cycle for Artificial Intelligence 2023, terms like Decision Intelligence and Intelligent Applications are marked as being two to five years away, and terms like AI Simulation and Operational AI Systems are marked as being five to ten years away.

This is partly because the established numerical methods are powerful and proven, while the new methods still come with many caveats. Therefore, industry will not immediately pick up on the potential. However, once people become aware that this new technology lets them do things that could not be done before, things will certainly start to shift. Our world will no doubt be very different ten years from now.

If you are interested in learning more about the potential impact of machine learning on computing and simulation, feel free to reach out to us. We actively follow the latest published scientific results and developments, and we can help with implementing new techniques for operational applications. After all, that is what we’ve been doing for more than 27 years already.