Digitalization is an important topic in many engineering companies, utility companies and environmental agencies. Data science is a key technology for digitalization. Find out how data science is applied in the engineering sector.
The role of data science in engineering
The engineering sector is in the middle of a data science revolution. Digitalization is the trend in asset management. Digital twins are the hype of the day. Data science is changing the way we monitor, operate, optimize and maintain assets, not only in industry but also in utility companies and in civil engineering.
From sensor to information
Sensors are getting cheaper, connectivity is almost ubiquitous and storage is abundant. This makes it relatively easy to collect all sorts of status information from assets. Assets can be industrial installations or engines. But also distribution networks, bridges, roads, canals, pumping stations and lots more.
However, sensor data is notoriously noisy and error-prone. Sensors fail or drift from their calibrated settings. Connections may fail. And the systems that collect the data have their own problems once in a while. Also, data may be collected in tremendous volumes (big data) of which only a fraction is actually useful.
Data engineering is used to transform the raw data into a reliable data feed. It cuts down the data to a useful volume and it can detect various sorts of errors. It can correct these errors or remove the incorrect data items. If necessary, estimates can be made for missing data (imputation).
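As a minimal sketch of such a cleaning step, the Python snippet below flags missing or physically implausible readings and fills them by linear interpolation between the nearest valid neighbours. The function name `clean_series`, the plausibility limits and the sample readings are assumptions made for illustration only; a production pipeline would typically add drift detection and more careful imputation.

```python
def clean_series(values, lower, upper):
    """Replace missing or out-of-range readings by linear interpolation.

    values: list of floats, with None marking a missing reading.
    lower, upper: hypothetical plausibility limits for this sensor.
    """
    # Step 1: flag readings that are missing or physically implausible.
    valid = [v is not None and lower <= v <= upper for v in values]
    good = [i for i, ok in enumerate(valid) if ok]
    if not good:
        raise ValueError("no valid readings to interpolate from")

    cleaned = []
    for i, v in enumerate(values):
        if valid[i]:
            cleaned.append(float(v))
            continue
        # Step 2: impute from the nearest valid neighbours on each side.
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            cleaned.append(float(values[right]))
        elif right is None:
            cleaned.append(float(values[left]))
        else:
            w = (i - left) / (right - left)
            cleaned.append((1 - w) * values[left] + w * values[right])
    return cleaned


# Illustrative temperature readings: one gap and one obvious spike.
raw = [20.1, 20.3, None, 20.7, 999.0, 21.1]
cleaned = clean_series(raw, lower=-40.0, upper=60.0)
print(cleaned)
```

The gap and the spike are both replaced by values that lie between their valid neighbours, while the trustworthy readings pass through unchanged.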
Note that data generated by sensors is not always numerical. In many cases important information is available in log files. Sifting through large volumes of text data and finding exactly the relevant entries is also best done by data science in combination with text mining. The data can also come in the form of images. Then, advanced image recognition methods can be used to extract information for further processing.
Extending sensor data to a full 3D or 4D view
Even though sensors may be abundant, there is often still a need to grasp what is going on between the sensors, or to gain insight into parts of the asset where no sensors can be placed.
In those cases, a computational model of the asset can be used to interpolate or extrapolate the data to unseen locations or times. A computational model will generate physically meaningful estimates of the unseen parts of the system. Estimates that are consistent with the data that is actually observed.
The techniques to do this are collectively known as data assimilation. VORtech has applied these techniques in a wide variety of cases: from building a real-time 3D field of the temperature in a data center (see our October 2017 newsletter) to locating problems in sewer networks and water distribution networks.
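To give an idea of how data assimilation spreads an observation to unobserved locations, here is a minimal sketch of a single Kalman-style analysis step in Python. It is a textbook update for one point observation, not a description of the method used in the projects mentioned above. The forecast temperatures, the covariance matrix and the observation are made-up numbers, and `assimilate` is a hypothetical helper name.

```python
def assimilate(forecast, cov, obs_index, obs_value, obs_var):
    """One Kalman-style analysis step for a single point observation.

    forecast: model estimate at every location.
    cov: forecast error covariance between locations (list of rows).
    obs_index: location where the sensor sits.
    obs_value, obs_var: sensor reading and its error variance.
    """
    # Difference between what the sensor saw and what the model predicted.
    innovation = obs_value - forecast[obs_index]
    # Total uncertainty at the sensor: model error plus sensor error.
    s = cov[obs_index][obs_index] + obs_var
    # The gain spreads the correction to other locations via the covariance.
    gain = [row[obs_index] / s for row in cov]
    return [x + k * innovation for x, k in zip(forecast, gain)]


# Assumed model forecast at three locations; only location 0 has a sensor.
forecast = [20.0, 22.0, 24.0]
# Assumed error covariance: correlation decays with distance.
cov = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]
analysis = assimilate(forecast, cov, obs_index=0, obs_value=21.0, obs_var=1.0)
print(analysis)  # [20.5, 22.25, 24.125]
```

The sensor reads one degree above the forecast. Because model and sensor are trusted equally here, the observed location moves half a degree, and the unobserved locations move by smaller amounts in proportion to how strongly their errors are correlated with the observed one.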
Monitoring and operating assets
But just collecting data is hardly useful as such. It becomes valuable only in the way it is used.
In many cases, the data needs to be presented to the operators in a convenient way. This requires the development of a dashboard that shows the relevant metrics to operators and allows them to dig into the data to understand what is going on. Standard dashboard platforms are available. But proprietary dashboards are also relatively easy to build using modern web technology. For example, VORtech developed the WBViewer for the Dutch Department of Waterways and Public Works.
Behind a dashboard is usually a collection of services based on data science. These create derivative information that will help the operator to focus on what is really important. For example, data science can be used to detect abnormal situations and trigger an alert. An example of this is the work that VORtech did on the Dynamic Bandwidth Monitor of Vitens. This generates an alert when there is evidence of a leakage in the water distribution network.
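A very simple form of such an alert service can be sketched as follows. Each new value is compared against a band derived from the preceding readings: the mean plus or minus a multiple of the standard deviation. This is an illustration only and not the algorithm of the Dynamic Bandwidth Monitor; the function name `band_alerts`, the window length, the factor `k` and the flow values are all assumptions.

```python
from statistics import mean, stdev


def band_alerts(series, window=5, k=3.0):
    """Return indices whose value lies outside mean +/- k * std
    of the `window` readings that precede it."""
    alerts = []
    for i in range(window, len(series)):
        history = series[i - window:i]
        centre = mean(history)
        spread = stdev(history)
        if abs(series[i] - centre) > k * spread:
            alerts.append(i)
    return alerts


# Made-up flow measurements with one sudden jump at index 7.
flow = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 9.8, 14.5, 10.0, 10.2]
alerts = band_alerts(flow)
print(alerts)  # [7]
```

Note that the jump itself widens the band for the readings that follow it. A more robust variant could base the band on the median and exclude flagged values from the history.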
Artificial intelligence techniques can be used to assist operators. They can suggest advisable actions, either based on business rules or through machine learning algorithms that have been trained on past information. They can also be used to determine the root cause of a failure or malfunction, for example through Bayesian networks.
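The idea behind Bayesian root cause analysis can be sketched with the simplest possible Bayesian network: one cause node with several symptom nodes that are assumed to be independent given the cause (naive Bayes). The causes, symptoms and all probabilities below are invented for illustration, and `rank_causes` is a hypothetical helper, not part of any library.

```python
def rank_causes(priors, likelihoods, observed):
    """Posterior probability of each cause given the observed symptoms.

    Assumes exactly one cause is active and that symptoms are
    conditionally independent given the cause.
    """
    scores = {}
    for cause, prior in priors.items():
        p = prior
        for symptom, present in observed.items():
            p_symptom = likelihoods[cause][symptom]
            p *= p_symptom if present else (1.0 - p_symptom)
        scores[cause] = p
    # Normalise so that the posteriors sum to one (Bayes' rule).
    total = sum(scores.values())
    return {cause: score / total for cause, score in scores.items()}


# Invented prior probabilities of each root cause for a pump.
priors = {"bearing_wear": 0.5, "clogged_inlet": 0.3, "sensor_fault": 0.2}
# Invented probabilities of seeing each symptom given the cause.
likelihoods = {
    "bearing_wear": {"vibration": 0.9, "low_flow": 0.2},
    "clogged_inlet": {"vibration": 0.3, "low_flow": 0.9},
    "sensor_fault": {"vibration": 0.1, "low_flow": 0.1},
}
observed = {"vibration": True, "low_flow": False}

posterior = rank_causes(priors, likelihoods, observed)
most_likely = max(posterior, key=posterior.get)
print(most_likely)
```

With high vibration and normal flow, bearing wear comes out as by far the most likely cause. Real Bayesian networks allow dependencies between symptoms and multiple simultaneous causes, which this sketch deliberately leaves out.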
Control and optimization
Data science plays an important role in the control of installations. Based on what it learned from data in the past, a machine learning algorithm can predict the future development of a system and determine the appropriate actions.
A nice example (based on reinforcement learning) is where a neural network is trained by applying small variations on the settings of a system and then observing the response of the system. This neural network can then automatically control the system to keep it in an optimal state. Such an automatically controlled, smart system can be seen as a form of artificial intelligence.
But this approach will fail if the optimal setting of the system is too far from the actual setting to be reached by small variations. Or if the system is too critical to allow this kind of experimentation. In that case, a computational model that accurately reflects the underlying processes of the system is often used to find the best setting using various optimization techniques.
But such models are often heavy in terms of computational load. One way to deal with this is by using approximate or reduced models. Here, again, machine learning can be helpful. A neural network can be trained on data that is generated by the computational model so that it becomes a fast approximation of the full model. In an operational setting, the trained neural network can be used to determine the optimal settings of the system.
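The surrogate idea can be illustrated without a neural network. In the sketch below, a cheap piecewise-linear interpolant is swapped in for the neural network and is built from a limited number of runs of the model. The optimal setting is then searched on the interpolant instead of on the model itself. The function `expensive_model` is a made-up stand-in for a heavy simulation, and the sampling range and resolution are assumptions.

```python
import bisect
import math


def expensive_model(setting):
    """Stand-in for a heavy simulation: cost as a function of one setting."""
    return (setting - 3.0) ** 2 + 0.5 * math.sin(2.0 * setting)


def build_surrogate(model, lo, hi, n_samples):
    """Run the model on a grid once and return a fast interpolating function."""
    xs = [lo + (hi - lo) * i / (n_samples - 1) for i in range(n_samples)]
    ys = [model(x) for x in xs]

    def surrogate(x):
        # Find the grid interval that contains x, clamped to the grid ends.
        j = bisect.bisect_right(xs, x)
        j = min(max(j, 1), len(xs) - 1)
        x0, x1 = xs[j - 1], xs[j]
        w = (x - x0) / (x1 - x0)
        return (1 - w) * ys[j - 1] + w * ys[j]

    return surrogate


# Offline: 61 model runs. Online: 601 cheap surrogate evaluations.
surrogate = build_surrogate(expensive_model, 0.0, 6.0, 61)
candidates = [i * 0.01 for i in range(601)]
best = min(candidates, key=surrogate)
print(round(best, 2))
```

The design choice is the same as with a trained neural network: pay the cost of the full model up front, when time is available, so that the operational system only needs the fast approximation. A neural network becomes attractive when there are many settings, because a grid like the one used here grows exponentially with the number of dimensions.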
Predictive maintenance
Predictive maintenance was one of the first applications of data science in engineering and is still one of the best known.
Typically, the data scientist will collect a large data set of system failures and, through data science, find early warning signals that predict an impending failure. If the prediction algorithm is accurate enough, it can be used during operations to predict when the system will fail and to plan maintenance accordingly.
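As a deliberately simple sketch of the prediction step, suppose that analysis of past failures has shown that failure becomes likely once a degradation indicator, for example a vibration level, exceeds some threshold. A least-squares trend line through recent measurements then gives a rough estimate of the time left. The threshold, the measurements and the function names are assumptions for illustration; real early warning signals are usually less clean than a linear trend.

```python
def fit_line(ts, ys):
    """Ordinary least-squares fit of y = slope * t + intercept."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def time_to_threshold(ts, ys, threshold):
    """Estimated time from the last measurement until the trend line
    crosses the threshold, or None if the indicator is not increasing."""
    slope, intercept = fit_line(ts, ys)
    if slope <= 0:
        return None
    t_cross = (threshold - intercept) / slope
    return max(0.0, t_cross - ts[-1])


# Made-up daily vibration readings and an assumed failure threshold.
days = [0, 1, 2, 3, 4]
vibration = [2.0, 2.5, 3.0, 3.5, 4.0]
remaining = time_to_threshold(days, vibration, threshold=7.0)
print(remaining)  # 6.0
```

In this example the indicator rises by 0.5 per day, so the trend line reaches the threshold six days after the last measurement. Maintenance can then be scheduled within that window rather than at a fixed interval.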
Usually, this approach leads to much lower maintenance costs than maintenance at predefined moments, which tend to be planned too frequently just to be sure that no failures will occur. But it should be stressed that it is not always easy, or even possible, to collect enough data on failures.
Are you interested?
VORtech is a company that operates at the forefront of technology for prediction and analysis in the engineering sector. Data science is a very powerful extension to our traditional toolbox of mathematics and computer science that we have been leveraging for over 20 years. Feel free to contact us if you would like to discuss how data science can help you with your engineering challenge.