
Unveiling the potential of machine learning driven asset management

Insights from a project at Dutch railway operator ProRail

Learn about the ongoing impact of cutting-edge work leveraging machine learning models for asset management within the Dutch railway network. By developing and operating a mature machine learning system, KPMG supports ProRail in their journey towards data-driven asset management. Explore the real-world outcomes of these advancements, driving improvements in operational efficiency and safety enhancements within the railway industry.

Introduction

ProRail’s responsibilities include traffic control and maintenance of the Dutch railway network, which spans over 7,000 kilometers of tracks ([ProR-a], [ProR-b]). Ensuring operational readiness and safety of the railway network is a top priority. However, managing large volumes of railway assets poses challenges. Notably, assets such as 11.5 million sleepers (in Dutch: “dwarsliggers”) and 46 thousand insulated rail joints (in Dutch: “elektrische scheidingslassen”) are critical to safe and reliable railway operations. These assets degrade over time due to factors such as the load of passing trains, weather conditions, and natural aging of materials. ProRail therefore uses the services of multiple contractors, who cover separate areas of the railway network. These contractors are tasked with maintaining the railways within defined requirements to ensure safe operating conditions: they actively search for and fix defects and inconsistencies, and complete maintenance as prescribed by ProRail.

Once work has been carried out, ProRail has the responsibility to validate and monitor the configuration (i.e. the physical layout and arrangement of assets) and condition of the railway network. At this stage, potential inconsistencies, such as placing the wrong type of asset, placing assets in the wrong location or not reporting replacements at all, might arise. In addition, ProRail is responsible for the operation of the railway network, which runs close to its maximum capacity during the day ([ProR-b]). Therefore, ProRail wants to avoid passenger and cargo delays by minimizing the downtime of the railways during the day for ad-hoc maintenance or visual inspection of the assets. The vast number of assets, their wide geographical distribution, and the limited time available for physical inspection present significant challenges for asset management.

Machine learning and asset management

To facilitate the asset management and monitoring of the railway network, images of the full railway network are captured twice a year by dedicated video inspection trains equipped with cameras positioned at various angles. One of these trains, operated by EURAILSCOUT, is shown in Figure 1. These cameras collect images with a sample distance of 20 centimeters, which adds up to a total of 500 million images of the railway infrastructure per year, or roughly 600 TB of data. Figure 2 shows example images of data collected with these trains. While these images allow for a digital inspection of the assets, the data volumes are so large that comprehensive inspection by humans is neither feasible nor desirable: it would require a significant time investment and would presumably be prone to human error.
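As a rough consistency check of these figures, the sketch below relates the sample distance, track length, and yearly totals quoted in the text. It assumes, purely for illustration, that 1 TB equals 10^12 bytes and that every camera view produces one image per sample position; the function names are hypothetical, and the result says nothing definitive about the actual number of cameras on the trains.

```python
# Back-of-the-envelope check of the data volumes quoted in the text.
# From the text: 20 cm sample distance, two runs per year, over 7,000 km
# of track, about 500 million images and about 600 TB per year.
# Assumptions (not from the text): 1 TB = 10**12 bytes, and each camera
# view yields exactly one image per sample position.

TRACK_KM = 7_000
SAMPLE_DISTANCE_M = 0.20
RUNS_PER_YEAR = 2
IMAGES_PER_YEAR = 500_000_000
DATA_PER_YEAR_TB = 600


def positions_per_run(track_km: float, sample_distance_m: float) -> int:
    """Number of sample positions along the track in one inspection run."""
    return round(track_km * 1_000 / sample_distance_m)


def implied_views(images_per_year: int, positions: int, runs: int) -> float:
    """Average number of images per sample position implied by the totals."""
    return images_per_year / (positions * runs)


def mean_image_size_mb(data_tb: float, images: int) -> float:
    """Average image size in megabytes (10**6 bytes)."""
    return data_tb * 1e12 / images / 1e6


positions = positions_per_run(TRACK_KM, SAMPLE_DISTANCE_M)
views = implied_views(IMAGES_PER_YEAR, positions, RUNS_PER_YEAR)
size_mb = mean_image_size_mb(DATA_PER_YEAR_TB, IMAGES_PER_YEAR)
print(f"{positions:,} positions per run, ~{views:.1f} views, ~{size_mb:.1f} MB/image")
```

Under these assumptions the totals correspond to about seven images per sample position and about 1.2 MB per image, which is at least consistent with several camera viewpoints per train.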

Figure 1. Video inspection train “Annet”, operated by EURAILSCOUT.

To automate and accelerate the monitoring of the railway assets using the inspection train images, KPMG supports ProRail by developing a machine learning system. This collaboration brings together a team of KPMG experts and ProRail’s product owner, working jointly within the Configuration Image Recognition team. The goal of this effort is to leverage machine learning models to detect, locate and classify various types of assets on the images captured across the country. Once assets are detected via computer vision, the information is used to update ProRail’s databases, to ensure they represent and match the physical world outside. In addition, the gathered information is aggregated and combined with different data sources to support various stakeholders within ProRail in their daily operations.

This data-driven approach for asset management serves as the cornerstone for transitioning railway maintenance practices from reactive and preventive to predictive and prescriptive measures. In other words, it will shift the maintenance approach from “addressing issues after they occur” and “routine maintenance” to “forecasting maintenance needs proactively” and “utilization of models to optimize operational strategies and prevent potential failures.” Through this digital transformation, ProRail aims to enhance the operational and safety conditions of the railway networks. Given that asset management impacts various operational processes, critical functions such as financial and maintenance planning also benefit from this shift. Consequently, the machine learning system complements ProRail’s vision to establish a digital twin of the railway network by the year 2040 ([ProR23]).

Figure 2. Overview images captured from the train. From top to bottom: the forward, downward, and right rail beam views. The red square in the forward view approximately overlaps with the downward view, while the blue square in the downward view roughly corresponds to the right rail beam views.

Addressing technical challenges within the machine learning system

The extensive availability and coverage of the railway images presents an ideal opportunity to develop a machine learning system to detect, locate and classify various assets throughout the country. By training multi-object detection models, essential railway assets are accurately and automatically identified, along with their specific subtypes (see Figure 3). Each essential asset category, such as sleepers and insulated rail joints, encompasses a range of subtypes. For example, wooden sleepers can be cut relatively easily into irregular shapes for use in, for instance, switches (Dutch: “wissels”), while concrete sleepers have a longer lifespan and are more resilient under compressive loads.

While the training of the models started with a handful of different assets and subtypes per camera viewpoint, the system was designed and engineered to be scalable. This means that additional assets and subtypes can be integrated into the existing model infrastructure, so that the system can continue to support stakeholders as their priorities and interest in different assets change.

However, a machine learning application shouldn’t be relied upon blindly: its answers are only as good as the data that it can work with. For example, in situations where snow limits the view on the train tracks, the model’s performance may experience a temporary decline due to “data drift”, which refers to a change in data distribution. Experience has shown that model performance can also deteriorate due to changes in the camera angle. To address such issues, continuous monitoring of the models is implemented to identify unexpected declines in performance and alert the development and operations (DevOps) team if necessary. With proactive monitoring in place, the DevOps team can make the necessary adjustments to uphold desired performance levels.
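One simple way to illustrate this kind of monitoring is to compare the detection confidences of a new image batch against a baseline from a period with known good performance. The sketch below is only an assumption about how such a check could look: the text does not state which metric is monitored, and the function name `drift_alert`, the z-score rule, and the threshold are all hypothetical.

```python
from statistics import mean, stdev


def drift_alert(baseline_scores, batch_scores, z_threshold=3.0):
    """Flag a batch whose mean confidence is far below the baseline mean.

    The batch mean is compared with the baseline mean in units of the
    standard error (baseline standard deviation / sqrt(batch size)).
    Hypothetical rule: alert when the z-score is below -z_threshold.
    """
    baseline_mean = mean(baseline_scores)
    standard_error = stdev(baseline_scores) / len(batch_scores) ** 0.5
    batch_mean = mean(batch_scores)
    if standard_error == 0:
        # Degenerate baseline without spread: any drop counts as drift.
        return batch_mean < baseline_mean
    z_score = (batch_mean - baseline_mean) / standard_error
    return z_score < -z_threshold


# Illustrative numbers only, not measurements from the project.
baseline = [0.90, 0.92, 0.88, 0.91, 0.89, 0.93, 0.87, 0.90]
snowy_batch = [0.60, 0.65, 0.55, 0.70]
normal_batch = [0.91, 0.89, 0.90, 0.88]

print(drift_alert(baseline, snowy_batch))
print(drift_alert(baseline, normal_batch))
```

Mean confidence is only a proxy for model performance; a production setup would typically also track labeled samples and input statistics such as image brightness or camera geometry.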

To enhance the model’s performance over time, images are isolated in cases where the model encounters difficulty in accurately predicting certain types of assets. For example, instances with low confidence levels – indicating high uncertainty during detection – or overlapping detections – like when a sleeper is simultaneously categorized as wood and concrete – are singled out for expert annotation. By applying Machine Learning Operations (MLOps) principles, the precision, adaptability, and reliability of the image recognition system are refined and enhanced.
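The selection rule described above can be sketched as follows. This is a minimal illustration, not the project’s implementation: the `Detection` structure, the label names, and both thresholds are assumptions, and overlap is measured here with intersection over union (IoU), a common choice for bounding boxes.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Detection:
    label: str    # hypothetical subtype name, e.g. "wooden_sleeper"
    score: float  # model confidence between 0 and 1
    box: tuple    # (x_min, y_min, x_max, y_max) in pixels


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def needs_annotation(detections, min_score=0.5, max_iou=0.7):
    """Return True if an image should be sent to an expert annotator.

    Triggers (thresholds are illustrative assumptions):
    - any detection with a confidence below min_score, or
    - two detections with different labels that overlap more than max_iou.
    """
    if any(d.score < min_score for d in detections):
        return True
    return any(
        a.label != b.label and iou(a.box, b.box) > max_iou
        for a, b in combinations(detections, 2)
    )


clean = [
    Detection("concrete_sleeper", 0.95, (0, 0, 100, 20)),
    Detection("concrete_sleeper", 0.93, (0, 60, 100, 80)),
]
conflicting = [
    Detection("wooden_sleeper", 0.80, (0, 0, 100, 20)),
    Detection("concrete_sleeper", 0.85, (0, 2, 100, 22)),
]
uncertain = [Detection("concrete_sleeper", 0.30, (0, 0, 100, 20))]

print(needs_annotation(clean), needs_annotation(conflicting), needs_annotation(uncertain))
```

Images flagged this way would then be labeled by experts and added to the training data, which is the feedback loop the paragraph describes.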

Since the image data is uploaded per region, large batches of data become available all at once. These sudden influxes of new data lead to spikes in the need for computing resources for image analysis. To effectively manage fluctuating workloads and optimize costs, the image recognition system was built on Azure, leveraging serverless compute from Azure Synapse Analytics and Azure Machine Learning resources for orchestration, model development and inferencing. With serverless compute, computing resources can be scaled up on demand, enabling analysis of large image batches upon upload. This architectural approach aligns with Data Operations (DataOps) principles to deliver new insights from the data to end-users.

To leverage the insights gained from the machine learning system, the asset detections made by the models are stored in a SQL database whose data model represents the physical railway network. The database stores the location and (sub)types of all detected assets over time, acting as a ground truth for diverse applications and stakeholders. Among other applications, the database is used to validate and update various databases within ProRail and to generate reports for various end-users. The database also serves as a resource for another computer vision team, the Condition Image Recognition team, which assesses the condition of the detected assets.
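To make the idea of storing detections over time concrete, the sketch below uses an in-memory SQLite database as a stand-in for the SQL database mentioned above. The table layout, column names, and sample rows are entirely hypothetical; the actual data model is not described in the text.

```python
import sqlite3

# Hypothetical, simplified schema: one row per detected asset per run.
SCHEMA = """
CREATE TABLE detection (
    id          INTEGER PRIMARY KEY,
    run_date    TEXT NOT NULL,   -- date of the inspection run (ISO format)
    track_id    TEXT NOT NULL,   -- identifier of the track
    position_km REAL NOT NULL,   -- position along the track in kilometers
    asset_type  TEXT NOT NULL,   -- e.g. 'sleeper'
    subtype     TEXT NOT NULL    -- e.g. 'wood' or 'concrete'
);
"""

# Invented example rows: a wooden sleeper that is later replaced.
DEMO_ROWS = [
    ("2023-04-01", "T1", 3.6000, "sleeper", "wood"),
    ("2023-10-01", "T1", 3.6000, "sleeper", "wood"),
    ("2024-04-01", "T1", 3.6001, "sleeper", "concrete"),
    ("2024-04-01", "T1", 3.6006, "sleeper", "concrete"),
]


def build_demo_db():
    """Create an in-memory database filled with the demo detections."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO detection (run_date, track_id, position_km, asset_type, subtype)"
        " VALUES (?, ?, ?, ?, ?)",
        DEMO_ROWS,
    )
    return conn


def subtype_history(conn, track_id, position_km, tolerance_km=0.0003):
    """Return (run_date, subtype) pairs for one location, oldest first."""
    return conn.execute(
        "SELECT run_date, subtype FROM detection"
        " WHERE track_id = ? AND ABS(position_km - ?) <= ?"
        " ORDER BY run_date",
        (track_id, position_km, tolerance_km),
    ).fetchall()


conn = build_demo_db()
history = subtype_history(conn, "T1", 3.6000)
print(history)
```

Keeping one row per detection and per run, rather than overwriting the latest state, is what allows changes such as a replaced sleeper to be traced afterwards.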

Figure 3. Example output from the model, where the purple bounding boxes are NS 90 sleepers, and the yellow bounding box is an ATBVV beacon (an asset related to railway safety).

Business value and outlook

The data-driven approach to asset management serves as the cornerstone of railway operations. Notably, the machine learning system delivers value by automatically and accurately identifying railway assets and revealing discrepancies within ProRail’s asset management system. Additionally, it offers comprehensive insights into the configuration of assets across the entire network, enabling the enhancement of the existing database in two key aspects: improving the accuracy of registered objects and capturing previously unrecorded assets.

At present, ProRail’s asset management system categorizes sleeper types per railway segment; for example, the railway from kilometer 3.6 to 4.3 is registered as consisting of concrete sleepers. In contrast, the machine learning system can identify each individual sleeper within the segment. By utilizing the data from each individual sleeper detection, the asset management system can be improved to represent the physical reality more accurately and in more detail.
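The difference between a per-segment registration and per-sleeper detections can be illustrated with a small sketch. The segment boundaries follow the example in the text; the detection list, the subtype names, and the function names are invented for illustration.

```python
from collections import Counter


def segment_composition(detections, start_km, end_km):
    """Count detected sleeper subtypes inside the half-open range [start_km, end_km)."""
    return Counter(
        subtype for km, subtype in detections if start_km <= km < end_km
    )


def mismatches(detections, start_km, end_km, registered_subtype):
    """List individual detections that contradict the segment registration."""
    return [
        (km, subtype)
        for km, subtype in detections
        if start_km <= km < end_km and subtype != registered_subtype
    ]


# Invented detections as (position in km, detected subtype).
detections = [
    (3.60, "concrete"),
    (3.61, "concrete"),
    (3.62, "wood"),
    (4.29, "concrete"),
    (4.30, "wood"),  # lies just outside the segment
]

composition = segment_composition(detections, 3.6, 4.3)
deviating = mismatches(detections, 3.6, 4.3, "concrete")
print(composition, deviating)
```

In this toy example the segment is registered as concrete, yet one wooden sleeper is detected inside it, which is exactly the kind of discrepancy a per-segment registration cannot express.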

Moreover, the machine learning system provides the opportunity to identify assets that are currently not registered in ProRail’s databases. One example is thermite welds in the rails, which are used to fuse two sections of rail together after maintenance. While these welds are not problematic in themselves, too many of them in close proximity can reduce the stiffness of the rails. The machine learning algorithm identifies these previously unregistered welds and helps ProRail establish mitigating measures to improve the longevity of the railway system.
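Once individual welds are located, finding stretches where they cluster becomes a simple counting problem. The sketch below uses a sliding window over sorted weld positions; the window length and the minimum count are hypothetical placeholders, since the text gives no threshold for what counts as too many welds.

```python
def dense_weld_windows(positions_m, window_m=10.0, min_welds=3):
    """Find windows of at most window_m meters holding at least min_welds welds.

    Returns (first_position, last_position, count) for every weld that
    starts such a window. Thresholds are illustrative assumptions only.
    """
    pts = sorted(positions_m)
    windows = []
    end = 0  # index one past the last weld inside the current window
    for start in range(len(pts)):
        # Extend the window while the next weld is within window_m of pts[start].
        while end < len(pts) and pts[end] - pts[start] <= window_m:
            end += 1
        count = end - start
        if count >= min_welds:
            windows.append((pts[start], pts[end - 1], count))
    return windows


# Invented weld positions in meters along one rail.
welds = [100.0, 104.0, 109.0, 250.0, 400.0, 403.0]
print(dense_weld_windows(welds))
```

Because the positions are sorted and the right edge of the window only moves forward, the scan runs in linear time after sorting, which matters little here but scales to network-wide data.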

Implementing a machine learning system drastically reduces the requirement for employees to maintain an asset management database at a consistent level of detail over time. A machine learning algorithm also does not get tired – and is less likely to make mistakes when properly trained. Through the development of a mature machine learning system, new assets, subtypes, and camera viewpoints can be seamlessly introduced and utilized to retroactively identify objects in old data sets. The flexibility and scalability of the image recognition system improves the quality of asset management by adapting to stakeholders’ changing priorities and interests in different assets over time.

An essential stakeholder benefiting from storing the individual detected assets within the SQL database is the Condition Image Recognition team at ProRail, tasked with analyzing the condition of assets. This team evaluates the presence of cracks within concrete sleepers ([ProR20]), for example. Since the machine learning algorithm detects and classifies sleepers individually instead of having the types registered per segment, this team now has direct access to the location of individual concrete sleepers.

By consistently storing the location, configuration and condition of railway assets within a SQL database, ProRail has the means to analyze asset degradation over time. As the database accumulates a substantial amount of high-quality data, the development and implementation of robust models can shift railway maintenance strategies from reactive and preventive approaches to predictive and prescriptive methodologies, thereby revolutionizing railway operations and maintenance.

This groundwork serves as the initial phase towards creating a complete digital twin of the railway infrastructure, facilitating real-time monitoring, predictive maintenance, and informed decision-making. By integrating diverse data sources over time, such as lidar that generates a 3D point cloud of the railroads and vibration data used to classify the ground layers below the rails, the digital twin will offer a thorough and current portrayal of the railway network, enhancing operational efficiency, safety and strategic long-term planning.

In addition to the images acquired from trains, the team is currently broadening the scope of the machine learning system to encompass the detection, localization, and classification of assets in images captured by helicopters. Twice a year, the railway network is aerially surveyed by helicopters equipped with cameras that capture images from diverse angles. Train-captured images are ideal for spotting smaller or track-adjacent assets. In contrast, helicopter-captured images are better suited for identifying larger assets that span multiple tracks or are located farther from the rails, such as overhead line portals and signs. Like the insights obtained from train images, the assets identified in the helicopter images are used to enhance the completeness of ProRail’s databases, support stakeholders, and facilitate various projects. Looking ahead, the integration of drone-captured images into the machine learning system may further bolster its capabilities ([ProR20]).

Conclusion

The integration of a mature machine learning system into railway asset management represents a significant leap forward in operational efficiency and safety standards within the Dutch railway network. By using multi-object detection models and data-driven maintenance strategies, organizations like ProRail build robust and future-proof maintenance approaches, enabling predictive and prescriptive methodologies for optimal asset performance.

The practical implementation of this cutting-edge system offers tangible benefits, driving cost savings, improving infrastructure uptime, and enhancing overall decision-making processes. In the near term, ProRail aims to leverage the machine learning system to detect most of its rail assets on images captured by both trains and helicopters. Looking ahead, the strategic outlook is promising, with a clear trajectory towards establishing a digital twin of the railway infrastructure, where data-driven insights will continue to shape the future of railway operations. As the industry embraces these innovations, the transformative impact of advanced image recognition technology in railway asset management is set to redefine industry standards and elevate operational excellence to new heights.

References

[ProR-a] ProRail (n.d.). Spooronderhoud. Retrieved from: https://www.prorail.nl/over-ons/wat-doet-prorail/spooronderhoud

[ProR-b] ProRail (n.d.). Verdeling van de capaciteit. Retrieved from: https://www.prorail.nl/over-ons/wat-doet-prorail/capaciteitsverdeling

[ProR20] ProRail (2020). Data & duurzaamheid: een gouden combinatie. Retrieved from: https://www.prorail.nl/nieuws/data–duurzaamheid-een-gouden-combinatie

[ProR23] ProRail (2023). De visie op digitalisering van ProRail: Spoor naar morgen. Retrieved from: https://www.prorail.nl/siteassets/homepage/over-ons/ict/prorail-visie-op-digitalisering-1.pdf