Operational forecasting with machine learning
Making forecasts is an important activity for most of our clients. For some it’s about predicting the future, for others it’s predicting the effect of some action.
In the early years of VORtech, such forecasts were usually based on mathematical models. These were developed by experts and programmed by our specialized scientific software engineers.
Since then, machine learning has emerged as a powerful new forecasting technique. It lets the computer derive the model from data, so that no explicit model has to be formulated by hand. VORtech has closely followed the development of machine learning, applying new methods as soon as they were mature enough, not only out of fascination with the technology but also because machine learning is the better option for some forecasting challenges.
From a software point of view, large parts of a machine learning application are the same as in more traditional model applications such as the tidal prediction system of Rijkswaterstaat. Much of the software is still dedicated to collecting and processing information from external sources, for dashboards (like the WBViewer) and for distribution of results to other systems. And yet, a machine learning application is also different from a traditional forecasting system.
Machine learning engineering
The differences are in both the development of the model and the operational context. During development, different and more powerful tools are used. In operation, facilities need to be in place to monitor whether the model is still valid for the data that is currently coming in, because a model can quickly become unreliable if the statistical distribution of the incoming data changes, or if the format of the data changes.
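As a rough illustration of such monitoring, the sketch below checks one batch of incoming values against a reference sample from the training period. It is a minimal example under stated assumptions, not a description of an actual VORtech system: the function names, the use of the two-sample Kolmogorov-Smirnov statistic as the drift measure, and the threshold of 0.3 are all chosen for illustration only.

```python
from bisect import bisect_right


def ks_statistic(reference, incoming):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical cumulative distributions of the two samples (0 to 1)."""
    ref = sorted(reference)
    inc = sorted(incoming)
    largest_gap = 0.0
    # The gap between two step functions can only change at a data point.
    for value in ref + inc:
        cdf_ref = bisect_right(ref, value) / len(ref)
        cdf_inc = bisect_right(inc, value) / len(inc)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_inc))
    return largest_gap


def check_batch(reference, incoming, threshold=0.3):
    """Return a list of warnings for one batch of incoming values.

    `threshold` is an illustrative value (an assumption), not a recommendation.
    """
    warnings = []
    # Format check: every value must be a real number (NaN is rejected too).
    bad = [v for v in incoming if not isinstance(v, (int, float)) or v != v]
    if bad:
        warnings.append(f"format: {len(bad)} non-numeric or NaN value(s)")
        return warnings
    # Distribution check: compare against the reference sample.
    if ks_statistic(reference, incoming) > threshold:
        warnings.append("drift: incoming distribution differs from reference")
    return warnings


reference = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5]
similar = [10.2, 10.7, 11.2, 11.7, 12.2, 12.7, 13.2, 13.7]
shifted = [v + 5.0 for v in reference]

print(check_batch(reference, similar))
print(check_batch(reference, shifted))
print(check_batch(reference, [10.2, "n/a", 11.0]))
```

In practice the threshold would have to be tuned per signal, and a warning would typically lead to a closer look at the data rather than to an automatic decision.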
VORtech has experience with the development of machine learning applications, also known as machine learning engineering. We know how to develop a model, bringing to the table our experience with sensor data and our understanding of physics. We also develop the entire data pipeline: we build data collection services, implement the quality monitoring and organize the distribution of results.
Developing machine learning models
It starts with the right question
One thing we’ve learned over the years is that perhaps the single most important step in developing a machine learning application is formulating the right question to be answered. That is why we take time to discuss with the client what the problem is that needs to be solved. With our broad understanding of science and technology, we quickly grasp what the clients’ experts want, even if they are from a different field of expertise.
Often, the initial problem formulation is not the correct one, or the original question cannot be answered with the data that is available. Therefore, this first phase is essential. Formulating the right problem takes you halfway. Working with the wrong question will never get you anywhere, no matter how hard you try.
Our blog post on defining valuable use cases for big data describes our approach.
Data is not always good data
In most cases, we have to work with the data from sensors that are already in place. Therefore, our next step is to see what the data is and what we can do with it. This again requires close collaboration with the clients’ experts and operators. Together we learn to understand and interpret the data. In some cases, we will advise installing additional sensors if essential information is missing.
Sensor data is notoriously messy. Sensors are not always properly calibrated, connections may fail, and the systems that collect the data may have their own issues. Furthermore, sensors tend to produce large volumes of data (big data), of which only a very small part is really useful.
An important task in machine learning engineering is processing this raw data into a reliable and efficient data stream. This is called data engineering. It includes filtering the data, leaving only the relevant parts, and detecting errors and artefacts. For example, in one project we developed algorithms that automatically detect faulty sensors. In some cases, errors can be repaired by estimating the missing values (imputation). Data engineering also includes combining different types of data into derived features that are more useful for machine learning than the original data.
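The three steps mentioned above (detecting faulty readings, imputation, and deriving features) can be sketched in a few lines. This is a simplified illustration, not the algorithms from the project mentioned: the function names, the valid range, the "stuck sensor" rule and the rolling mean as derived feature are all assumptions made for the example, and the readings are made up.

```python
def flag_faulty(readings, low, high, max_repeats=5):
    """Replace suspicious readings with None.

    A reading is suspicious if it lies outside [low, high], or if it is part
    of a run of more than `max_repeats` identical values ("stuck" sensor).
    """
    cleaned = list(readings)
    run_start = 0
    for i in range(1, len(readings) + 1):
        # A run ends at the end of the series or when the value changes.
        if i == len(readings) or readings[i] != readings[run_start]:
            if i - run_start > max_repeats:
                for j in range(run_start, i):
                    cleaned[j] = None
            run_start = i
    return [v if v is not None and low <= v <= high else None for v in cleaned]


def impute_linear(values):
    """Fill gaps (None) by linear interpolation between the nearest valid
    neighbours. Gaps at the start or end have only one neighbour and are
    left as None rather than extrapolated."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def rolling_mean(values, window):
    """Derived feature: mean of the valid values among the last `window`
    samples at each step (None if there are no valid values)."""
    feature = []
    for i in range(len(values)):
        recent = [v for v in values[max(0, i - window + 1): i + 1] if v is not None]
        feature.append(sum(recent) / len(recent) if recent else None)
    return feature


# Made-up temperature readings: one outlier (999.0) and one dropped sample.
raw = [20.1, 20.3, 999.0, 20.7, None, 21.1]
flagged = flag_faulty(raw, low=-40.0, high=60.0)
filled = impute_linear(flagged)
smoothed = rolling_mean(filled, window=3)
print(flagged)
print(filled)
print(smoothed)
```

Whether interpolation is an acceptable repair depends on the signal. For long gaps or fast-changing processes it is usually safer to keep the gap and let the model handle missing values explicitly.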
The model itself
VORtech has experience with a wide range of machine learning and statistical methods. The challenge is usually in finding the right combination of data and algorithm. If possible, we prefer to use a machine learning technique that is easy to interpret. This not only helps to get the algorithm accepted by operators, but also gives insight into its reliability.
In many cases, a relatively simple model is already powerful enough, provided it is fed with the right features. But if really specialized knowledge about a particular method is needed, we will not hesitate to involve an external expert. We know our limitations.
The data pipeline
As mentioned before, we can build or improve a data pipeline for you. We know how to collect data from external sources, and we are good at developing efficient algorithms for processing data into features. Using modern software platforms, we develop intuitive dashboards and we know how to distribute results to external systems.
Use cases for machine learning in the technology sector
For inspiration, here are a few well-known use cases of machine learning, with examples of projects that we have worked on.
- Process Optimisation
Machine learning can help to optimize processes, for example by making predictions that operators can use to adjust the settings. A nice example is the work that we have done on planning home visits for maintenance engineers. It turns out that there are patterns in the moments when people are at home. Using those patterns in the planning avoids many unsuccessful home visits.
- Support for Operators
Machine learning can detect unwanted changes in processes and thereby give useful information to operators. For Vitens, we have worked on algorithms to quickly detect leakages in the water distribution network and to also pinpoint the most likely location of the leakage. This allows engineers to quickly find the leakage and repair it even before anyone starts complaining about problems in the water supply.
- Predictive Maintenance
Machine learning can be used to determine how long an asset can continue to function without problems. A challenge for this type of application is that there is usually very little data of actual failures. Therefore, these methods tend to focus on proxies that are indicative of an impending problem.
- Quality Control
Computer vision has become a standard technology for detecting defects in products. For many inspection tasks, a computer is now better and faster than humans at recognizing quality issues, and often cheaper as well.
Are you interested?
VORtech is a company that operates at the forefront of technology for prediction and analysis in the engineering sector. Machine learning is a very powerful extension of the traditional toolbox of mathematics and computer science that we have been using for over 20 years. Please contact us if you would like to discuss how machine learning can help you with your engineering challenge.