
Governing the Amsterdam Innovation ArenA Data Lake

Enterprises feel the need to create more value out of the data they are collecting, as well as the data that is openly available. Traditional data warehouses cannot support analyses of multi-format data, which has given rise to the popularity of data lakes. However, data lakes require controls to be effective, making data governance of the utmost importance to data lake management. Amsterdam ArenA is an example of an enterprise that has joined this movement, paving the way to the creation of a smart city.

Data-driven innovation

Data is being collected at a greater rate than ever before. Over 2.5 quintillion bytes of data are created every day ([IBM16]) and this number is rapidly increasing. In fact, more data has been created in the last two years alone than in the entire previous history of the human race. In line with this rapid increase in data collection, data analytics has become an increasingly popular topic. Data analytics refers to the business intelligence and analytical technologies grounded in statistical analysis and data mining. Although increasingly popular, less than 0.5% of all data has ever been used and analyzed ([Marr15]), demonstrating that much of the potential is still untapped.

Organizations are not letting this potential value slip away: 75% of organizations have already implemented or are currently implementing data-driven initiatives ([GART16], [IDG16]). The aim of these initiatives is to increase operational efficiency, improve customer relationships and make the business more data-focused. However, this is only part of the potential, as business analytics generally only considers the structured data that companies collect about their own operations. Innovation is about thinking outside the box, and ideally it would include more than structured internal data. The possibilities when combining different datasets of different formats are endless.

One such data-driven innovation application is the development of data-driven Smart Cities. Traditional cities are extremely inefficient in terms of waste. Smart Cities aim to better control the production and distribution of resources such as food, energy, mobility and water, which can be achieved through data collection and analytics. For instance, real-time traffic data can be used to suggest alternate routes for drivers, or supply levels can be adjusted to better meet demand based on historical purchasing data. These are just a few examples of how data can increase a city’s efficiency. Amsterdam ArenA is an example of an organization that decided to join this movement. They have started to use the data that they and their partners have been collecting for years in new ways and have switched their focus from the optimization of individual systems to the creation of effective network systems. This will lay the foundation for the creation of a Smart Stadium and eventually, a Smart City.

ArenA launched an initiative called the Amsterdam Innovation ArenA (AIA), which provides a safe, competition-free, open innovation platform where companies, governments and research institutions can work together to make quick advancements and test smart applications and solutions. The stadium and its surrounding area serve as a living laboratory, a hotspot where innovations are tested in a live environment. Amsterdam ArenA has a data lake which stores a large array of internally collected data, ranging from Wi-Fi location data and solar panel data to video camera data and much more. They have also installed a data analytics platform, which allows projects to be carried out in data labs: data sources are gathered from the lake and combined in these analytical environments.


Figure 1. Platform Scope.

Unfortunately, integrating a number of different datasets is more complex than it sounds: consider combining video camera data (unstructured data) with a table in an Excel sheet (structured data), for example. It also poses risks to the ArenA as an organization, in terms of compliance with data privacy legislation, but also the misuse of data for purposes or analyses it was not intended for. It is therefore important to control the use of the platform, but without lowering the innovative value of the platform, the ‘data playground’.

Data Warehouse, Data Lake: what is the difference?

The vast majority of collected data is of an unstructured nature. There are four main types of data: structured (formal schema and data model), unstructured (no predefined data model), semi-structured (some organizational properties but no rigid data model) and mixed (various types together). Currently, only about 20% of all data is structured ([GART16]). Yet traditional data warehouses only support structured data, meaning that the vast majority of collected data cannot be stored for analytical purposes. To resolve this, enterprises have begun using data lakes. A data lake is a storage repository that holds a vast amount of data in its native format. Neither the structure of the data nor its requirements are defined until needed. Unlike traditional data warehouses, data lakes also support the storage of unstructured data types. In traditional data warehouses data is cleaned before it is stored; not having to do this when storing data in a data lake saves both time and money, as analysts only have to clean the data that is relevant for their analysis. The costs of data storage are also significantly lower in data lakes, as the architecture of the platform is designed for low-cost and scalable storage. However, there are two drawbacks to the use of data lakes. As they are still relatively new, the security standards of data lakes are not as high as those of data warehouses. Moreover, using mixed data formats requires experienced and skilled data scientists, who are often not present in the average organization ([Dull17]).


Table 1. Data Warehouse vs. Data Lake: key differences.

The trade-off between innovation and control

Ideally, anything would be allowed when analyzing the data in the data lake; in practice, however, this is impossible. On the one hand, enterprises should strive to store as much data in the lake as possible and let users have full freedom to innovate. Imagine combining real-time social media data, sales data and personalized promotions, for example: this combination would allow a firm to offer satisfied or dissatisfied customers (based on their social media behavior) special promotions when sales are down. On the other hand, both data storage and user activity need to be controlled. There are legal mandates about the maximum storage time of specific types of data: video camera data may only be stored for a maximum of 4 weeks, and often even as little as 48 hours ([AUTP17]). When working with external suppliers, the data lake provider should also inspire confidence and demonstrate that it has control over the data lake. Data suppliers will not share their data on a platform where users handle data without restrictions. So how does one find the balance between innovation and control?

ArenA also faced this challenge when implementing its data lake and data labs. Implementing the data lake on an organizational scale, and allowing not only internal but also external users to make use of it, poses risks to ArenA. Users should be given full freedom to stimulate innovation, yet ArenA should maintain control over user activity to ensure data is used appropriately. ArenA also faced challenges concerning privacy regulations, as part of the collected data is customer-related and saving it in its raw format would infringe those regulations. To overcome these challenges, we developed data governance around the data lake. This enables all (external) parties to become data suppliers in a safe and reliable manner.

Overcoming related challenges: Amsterdam ArenA

Besides the trade-off between innovation and control, two of the most common challenges enterprises face when implementing a data lake are maintaining control of what is saved and finding the right people to carry out analyses. If everything is blindly saved in the data lake, data is simply being stored and never looked at again. Actually getting value from the data is the responsibility of the end user, which increases the risk that the data becomes a collection of disconnected data pools or information silos. This phenomenon is also referred to as the creation of a data swamp rather than a data lake ([Bodk15]): try to make use of it and you will drown. Data lakes require clear guidelines on what will be saved, data definitions and quality rules. Furthermore, carrying out analyses on a wide range of different data sources requires highly skilled analysts. A common assumption is that a data lake can be marketed as an enterprise tool: once it is created, employees will be able to make use of it. This assumes that all employees have the skills to do so, whereas the average company has only a limited number of analysts or data scientists on its payroll.

The ArenA overcame these challenges when implementing its data lake. The first challenge was the recruitment of skilled staff with knowledge of existing analytics methods and applications. Amsterdam ArenA overcame this issue by creating an open analytics platform: AIA does not rely on its own employees alone, as the platform has been made accessible to everyone. Due to the large variety of data and the innovative nature of the platform, AIA does not need to make use of generalized data quality rules; data preparation is the responsibility of the end user. In order to avoid creating a data swamp, Amsterdam ArenA stores data in the data lake with metadata (such as date, content and event information), making it easy to find specific datasets quickly and combine datasets on an event basis. Furthermore, only unique and interesting moments of video camera data are stored, scrapping the large amounts of valueless data (e.g. video footage of an empty stadium). The data analytics platform is built on top of the data lake, and data is only loaded into a project’s analytical environment when a project is initiated.
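To illustrate the idea of metadata-driven storage, the sketch below shows a minimal dataset catalogue keyed on event and date. It is a hypothetical illustration in Python, not ArenA’s actual tooling; all names, fields and paths are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DatasetEntry:
    """Catalogue record describing one dataset stored in the lake."""
    name: str            # e.g. "wifi_locations"
    source: str          # supplying system or partner
    content: str         # short description of the content
    event: str           # event the data relates to
    event_date: date
    storage_path: str    # location of the raw file in the lake

class LakeCatalogue:
    """In-memory metadata catalogue; a real lake would persist this index."""
    def __init__(self) -> None:
        self._entries: list[DatasetEntry] = []

    def register(self, entry: DatasetEntry) -> None:
        self._entries.append(entry)

    def find_by_event(self, event: str) -> list[DatasetEntry]:
        """Return all datasets tagged with the same event, ready to be combined."""
        return [e for e in self._entries if e.event == event]

# Usage: register two sources and look them up on an event basis.
catalogue = LakeCatalogue()
catalogue.register(DatasetEntry("wifi_locations", "Wi-Fi sensors", "visitor movement",
                                "ConcertX", date(2017, 3, 5), "/lake/raw/wifi/2017-03-05"))
catalogue.register(DatasetEntry("fnb_sales", "Point of sale", "food & beverage sales",
                                "ConcertX", date(2017, 3, 5), "/lake/raw/pos/2017-03-05"))
for entry in catalogue.find_by_event("ConcertX"):
    print(entry.name, entry.storage_path)
```

The point of the sketch is only that event-level metadata, rather than a fixed schema, is what keeps datasets findable and combinable once they sit in the lake in raw form.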

The need for data governance: finding the balance

Data governance is an overarching concept which defines the roles and responsibilities of individuals throughout data creation, reading, updating and deletion. Since data lakes employ a large array of data sources, clear rules must be laid down to control operations and to comply with legal regulations. With the establishment of increasingly strict laws in the privacy domain, companies must be especially mindful that their big data operations may not be compliant. Not only must companies be compliant, they also need to protect themselves against possible future developments, both within their firms and in the market. If something goes wrong, who is held accountable? Is sensitive personal information being analyzed? Who has ownership of the data? Who decides what data may and may not be used for? What controls are in place to ensure that data is used for the right purposes? Who deletes the data once it is no longer needed? Many questions arise when considering effective big data management. These questions can be answered by implementing data governance within an organization.

Implementing data governance

The first and fundamental element of governance is a virtual organization that defines the roles and responsibilities with regard to the handling of data. Depending on the size of the organization, a number of data-related roles are defined. These are distributed over three levels: strategic, tactical and operational. Generally, all accountability and data strategy decisions are made at the strategic level, and day-to-day decision making takes place at the tactical level. Daily operations such as data analysis and authorization management take place at the operational level. The strategic vision serves as a guideline for what new data is added to the data lake and which projects are carried out in the data labs.

Based on the defined roles, the data lifecycle process is defined step by step for every activity that takes place. This ensures that all individuals with data-related tasks know what their responsibilities are and which activities they need to carry out. Drawing up such a process flow also clarifies where the potential risks in the process lie, allowing organizations to implement the necessary controls to mitigate them.


Figure 2. Governance Documentation.

At a tactical level, clear agreements must be made about the data and its use. This is important in order to create stable partnerships and inspire trust from external parties. Hence, data delivery agreements are made between data suppliers and data receivers. In this agreement the responsibilities of both parties are explained and agreed upon. The agreement also specifically defines both the expected content (attributes) and the permitted uses of the data in question, which offers insight into the potential applications of the data and allows the organization to keep track of the data and data attributes in the data lake, preventing the creation of a data swamp.
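As a hedged illustration, such a data delivery agreement can also be captured in machine-readable form, so that incoming deliveries can be checked against the agreed attributes. The Python sketch below is purely hypothetical; the supplier, dataset and attribute names are invented for the example.

```python
# Illustrative only: a data delivery agreement captured as a structured record,
# so that deliveries can be checked against the agreed attributes and uses.
DELIVERY_AGREEMENT = {
    "supplier": "ExamplePartner B.V.",          # hypothetical supplier
    "dataset": "parking_transactions",
    "expected_attributes": {"timestamp", "zone", "duration_minutes", "amount"},
    "permitted_uses": {"mobility research", "event planning"},
    "max_retention_days": 365,
}

def validate_delivery(records: list[dict], agreement: dict) -> list[str]:
    """Return a list of issues found when comparing a delivery to the agreement."""
    issues = []
    expected = agreement["expected_attributes"]
    for i, record in enumerate(records):
        missing = expected - record.keys()
        unexpected = record.keys() - expected
        if missing:
            issues.append(f"record {i}: missing attributes {sorted(missing)}")
        if unexpected:
            issues.append(f"record {i}: undeclared attributes {sorted(unexpected)}")
    return issues

sample = [{"timestamp": "2017-03-05T19:30", "zone": "P1", "duration_minutes": 180, "amount": 12.5}]
print(validate_delivery(sample, DELIVERY_AGREEMENT))   # -> [] when the delivery matches
```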

At an operational level, the use of data must be controlled: is data usage in compliance with the terms of the relevant data delivery agreement? For this reason, data usage agreements are signed by all users of the platform. To assess whether all users truly act according to the terms laid down in the data usage agreement, one of the roles in the data governance model is that of a controller, who is responsible for the continuous monitoring of user activity. An ideal analytical platform also logs all user activity, making it easy to identify any misuse.

Finally, there is one last control mechanism that is part of a data governance implementation: the authorization matrix. This document gives an overview of all available data and who holds which authorizations over it. At any moment in time, the authorization matrix can be used to assess whether all formal documentation reflects the current state of the platform.
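A minimal sketch of how such an authorization matrix could be checked programmatically, and how a controller might compare logged user activity against it, is shown below. The datasets, roles and log entries are assumptions made for illustration, not ArenA’s actual setup.

```python
# Illustrative sketch of an authorization matrix: datasets on one axis,
# users/roles on the other, with the authorizations each of them holds.
AUTHORIZATION_MATRIX = {
    "wifi_locations": {"data_lab_alpha": {"read"}, "platform_admin": {"read", "delete"}},
    "fnb_sales":      {"student_project": {"read"}, "platform_admin": {"read", "delete"}},
}

def is_authorized(user: str, dataset: str, action: str) -> bool:
    """Check whether a user holds the requested authorization for a dataset."""
    return action in AUTHORIZATION_MATRIX.get(dataset, {}).get(user, set())

# A controller can also compare logged activity against the matrix.
activity_log = [("student_project", "wifi_locations", "read")]   # hypothetical log entries
violations = [entry for entry in activity_log if not is_authorized(*entry)]
print(violations)   # -> the student project reading Wi-Fi data is flagged
```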


Figure 3. Data Lifecycle Process.

The innovative governed data lake

The Amsterdam ArenA now has a running platform on which a wide variety of data is saved. From the data lake, data is periodically migrated to either the innovation platform or specific data labs for projects. These data labs are set up on a project basis when requests are made by external parties. The entire process is strictly governed and accurately documented through data governance. Based on the existing documentation, it is easy to assess in which stage of the process the various projects lie. During the implementation it became clear that both the data type and the data transaction may influence what documentation is required; data leaving the ArenA platform, for example, requires more documentation. Overall, partners feel confident in sharing their data with ArenA and many new partnerships are expected in the near future. Similarly, there has been great interest in the platform from the user side, such as students who have been using the platform for university projects. One such project involved students analyzing purchasing data: they used the large food and beverage dataset to discover patterns in purchasing behavior at Amsterdam ArenA in relation to external factors such as weather, event type and customer demographics. ArenA’s analytics platform is a safe innovation playground that will play a role in developing the data scientists of the future.

Concluding

Enterprises are creating more value from data, whether it is structured, semi-structured, unstructured or mixed. Combining all available information in a data lake creates innovative ideas and new insights that add value to the business as well as to society. With data comes knowledge, and with knowledge comes power. To avoid this power being abused, data lakes require data governance. Implementing data governance allows enterprises to stimulate innovation within their firms without risking loss of control over user activity.

 

Editorial

Data Management: ‘If you think good data is expensive, try bad data.’ [Adapted from Brian Foote & Joseph Yoder.]

 

In 2012 we published our Enterprise Data Management special in Compact for the first time (https://www.compact.nl/publicatie/2012-2/). We concluded our editorial saying: ‘We hope that reading this Compact may provide you with a moment of enlightenment and that you will allocate data the place on the agenda that it deserves.’ And so it did! In 2014 another EDM issue of Compact, about the value of data, was released (https://www.compact.nl/publicatie/2014-2/).

Over the last four years we at KPMG have seen ‘data’ become a major topic in the boardroom. Driven by compliance, growth or efficiency, data became an ever-present element in discussions at the highest level. However, it is fair to say that those in command may not always have been aware of the fact that data was, and is, a precondition for many, if not all, compliance, growth or efficiency initiatives. Nevertheless, through those initiatives they became the sponsors for setting up and improving data management.

Examples of data projects at the back end of business programs and projects are discussed in this edition of Compact. Akça and Biewenga describe the challenges organizations face when they have to deal with large-scale data migrations. Data governance is an ever-present topic in every data-related project. Jeurissen and Martijn discuss data governance in the predictive analytics environment of the Amsterdam Innovation ArenA. Laws and regulations have become a driving force for many data initiatives in the last few years. Solvency and Basel have explicit requirements for data quality, and the BCBS 239 regulations provide even more explicit guidance for data management. Rothwell et al. provide an end-to-end view on data quality. Data quality has specifically become a topic in the food retail industry, where the quality of food-related data has been a challenge for many years. Stakeholders participating in GS1 have developed new initiatives to increase the quality of food data; this is further dealt with by Van der Ham, Van Rijswijk and Swartjes. Organizational models for managing data are a constant element in discussions with our clients. Van der Staaij and Tegelaar provide us with their views based on practical experience in their contribution to this Compact. Many of our clients are looking for new business models to capitalize on all the data that they have available, potentially enriching internal datasets with external data. This is the topic of Verhoeven’s contribution. Lastly, Martijn and Tegelaar provide us with their view on document management, archiving and retention.

We are convinced that only those organizations that are able to manage and control their data will survive, while those who do not see the need and necessity to do so will fade away. The latter will not be able to comply with laws and regulations and will be faced with litigation and fines. They will not be able to provide their customers with a seamless digital journey, which will drive their customers to their competitors. And they will be faced with inefficiencies in collecting, storing, exchanging and archiving data. Leading companies have already discovered that bad data is more expensive than good data.

Ronald Jonker

Partner Enterprise Data Management and Guest Editor

ERP Data Migration

The driver for data migration is often the business decision to modify the ERP landscape. Apart from the necessity and the complexity involved, data migration is a value-adding activity for your organization. It is pivotal to take the time to consider the factors that play a role in determining the most suitable migration approach. Data migration costs, system performance, outage duration, data volume, data quality, data retention and the enablement of new system functionality are all factors to take into consideration when choosing your migration approach. This article describes two alternative approaches and, subsequently, how these approaches cope with the described factors.

Introduction to data migration

Data migration is a systematic and phased approach for transferring data between (multiple) systems based on a specific migration methodology. The driver for data migration is often based on the business decision to modify the ERP landscape.

Data migration is a value-adding activity for the organization. Moreover, an end-to-end data migration exposes the full data landscape of an organization in a unique way. Going through the migration process can reveal many unknowns in organizational data and data relationships and enables mapping of the full organizational data landscape. In addition, data migration supports the identification of data owners and can be an improvement, or a solid start, for setting up data governance within your organization.

It is worth mentioning that data migrations are always complex in nature. It is of critical importance to design internal controls that provide assurance regarding the completeness and accuracy of the data selected for migration; a solid and extensive data reconciliation framework, including checks and balances, is crucial to verify the completeness and accuracy of the migrated data. Based on our experience, data migration consumes on average 15-25% of the project implementation budget.

With regard to the ERP data migration we have identified two approaches based on our industry experience. These two approaches will be explained in detail within this article. Both approaches have their own specific complexity and challenges.

The transaction driven approach is the most common approach for an ERP migration. In this approach data is extracted from the legacy system and converted into a load file, which in turn can be loaded using the standard load programs provided by the ERP system. The scope of the data migration is often limited to active data and is therefore more cost-effective than the table driven approach.

The table driven approach is less commonly used, as it entails the full transfer of tables without any selection criteria. Based on our experience with a multinational client, we describe an interesting real-life case in which the table driven approach was implemented successfully in an SAP ERP environment.

The transaction driven and table driven approaches are described in the following paragraphs, together with their inherent benefits and challenges.

Transaction driven migration approach

In the transaction driven approach the migrated dataset is often limited to active data. The active dataset is extracted from the legacy system and converted into load files. These load files can be uploaded using the standard load programs or APIs provided by the ERP system. By making use of the standard load programs, transactions are processed in the system as if they were new transactions. This approach leverages the functional consistency and completeness checks embedded in the ERP system and its load programs. In addition, any error messages from the data load provide useful information on both transactional and associated master data. These error and warning messages from the loading programs (a simplified sketch of this feedback loop follows the list below) provide insight into the:

  • changes required to the conversion logic;
  • enrichment that needs to take place;
  • required cleansing activities.
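The sketch below illustrates this feedback loop in Python under the assumption of a very small record structure: records are converted into the load format, pushed through a stub that stands in for the ERP’s standard load program, and the returned messages are grouped by the follow-up action they imply. The field names, messages and categorization rules are invented for the example, not those of any real ERP interface.

```python
# Hypothetical records and messages; the load program here is a stub standing
# in for the ERP-provided interface, not a real API.
def convert_to_load_format(legacy_record: dict) -> dict:
    """Map legacy field names onto the structure expected by the load program."""
    return {
        "DocumentDate": legacy_record["doc_date"],
        "CustomerID": legacy_record["cust_no"],
        "Amount": legacy_record["amount"],
    }

def standard_load_program_stub(load_record: dict) -> list[str]:
    """Return the kind of error/warning messages a standard load program might emit."""
    messages = []
    if not load_record["CustomerID"]:
        messages.append("ERROR: customer master data missing")
    if load_record["Amount"] is None:
        messages.append("WARNING: amount empty, enrichment required")
    return messages

def run_load_cycle(legacy_records: list[dict]) -> dict[str, list[dict]]:
    """Bucket failed records by the follow-up action their messages imply."""
    actions = {"conversion_change": [], "enrichment": [], "cleansing": []}
    for record in legacy_records:
        for message in standard_load_program_stub(convert_to_load_format(record)):
            if "master data missing" in message:
                actions["cleansing"].append(record)
            elif "enrichment required" in message:
                actions["enrichment"].append(record)
            else:
                actions["conversion_change"].append(record)
    return actions

# Example cycle: one clean record, one that needs cleansing and enrichment.
print(run_load_cycle([
    {"doc_date": "2017-01-31", "cust_no": "C001", "amount": 250.0},
    {"doc_date": "2017-02-15", "cust_no": "", "amount": None},
]))
```

In a real project the resulting buckets would feed the next conversion, enrichment and cleansing cycle rather than a print statement.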

As a consequence of the transaction driven approach, a data warehouse must serve as the data repository for historic data instead of the new ERP system. This approach also makes it necessary for the customer to maintain the legacy system in read-only mode after go-live in order to comply with local legislation on data retention timelines. However, we see that the need for maintaining the legacy system after go-live decreases rapidly over time: after one year, the average system usage is generally restricted to one or two users for ad-hoc requests.

As mentioned earlier, when clients are implementing a new ERP system, we advise limiting their data migration to active data only. In this article, we explain the rationale for limiting the migration dataset to active data and define what active data is.

But what is active data? Active data consists of active transactional data, active master data and general ledger balances. Early data migration scoping is essential for a successful ERP implementation. This scoping exercise starts with defining your active transactional data: financial transactions which are not yet complete at the time of migration or which could lead to financial transactions in the future.

It is important to agree on a timeline when defining these future transactions and to set it in stone, for example transactions that will take place within six months after go-live. Active master data is derived from the scope of the active transactional data: the master data required to process active transactional data defines the active master data.
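The following minimal sketch shows how such a scoping rule could be expressed. The six-month horizon, field names and record structures are illustrative assumptions, not a prescribed implementation.

```python
from datetime import date, timedelta

GO_LIVE = date(2017, 7, 1)               # assumed go-live date, for illustration
HORIZON = GO_LIVE + timedelta(days=182)  # agreed cut-off: roughly six months after go-live

def is_active_transaction(txn: dict) -> bool:
    """Active: not yet complete at migration, or scheduled before the agreed horizon."""
    return txn["status"] != "complete" or GO_LIVE <= txn["due_date"] <= HORIZON

def scope_migration(transactions: list[dict], master_data: dict) -> tuple[list[dict], dict]:
    """Select active transactions and derive the master data needed to process them."""
    active_txns = [t for t in transactions if is_active_transaction(t)]
    needed_customers = {t["customer_id"] for t in active_txns}
    active_master = {cid: rec for cid, rec in master_data.items() if cid in needed_customers}
    return active_txns, active_master

# Example: an open invoice stays in scope, a settled historic invoice does not.
txns = [
    {"id": 1, "status": "open", "due_date": date(2017, 8, 1), "customer_id": "C001"},
    {"id": 2, "status": "complete", "due_date": date(2015, 3, 1), "customer_id": "C002"},
]
print(scope_migration(txns, {"C001": {"name": "Alpha"}, "C002": {"name": "Beta"}}))
```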

Although some clients prefer to also migrate historic data, there are clear arguments why they should not. The most prominent reasons not to migrate historic data relate to the effect of higher data volumes on the performance and stability of the system, the system outage duration and the associated higher cost of the migration.

Data migration costs consume a large portion of the implementation budget (15-25% on average). This is caused by the necessity of multiple data migration cycles. These cycles are not restricted to rehearsing the data migration itself; they also serve integration, user acceptance and regression testing.

Additional cost drivers for data migration are the setup of data migration logic, conversion logic and reconciliation scripts. This setup is required to assure the completeness and accuracy of the migrated dataset and covers the following (a minimal sketch of these steps follows the list):

  • extracting data from the legacy system to a staging environment;
  • converting and enriching data to the structure required to load into the new ERP system;
  • reconciling the end-result based on completeness and accuracy of data.
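The Python sketch below walks through these three steps with placeholder logic. It is a hedged illustration of the shape of such a setup, not actual migration tooling; all field names and mappings are assumptions.

```python
def extract_to_staging(legacy_rows: list[dict]) -> list[dict]:
    """Step 1: copy the selected legacy data into the staging environment unchanged."""
    return [dict(row) for row in legacy_rows]

def convert_and_enrich(staged: list[dict], country_codes: dict[str, str]) -> list[dict]:
    """Step 2: map staged rows onto the target structure and enrich missing fields."""
    return [{
        "partner_id": row["cust_no"],
        "amount": round(float(row["amount"]), 2),
        "country": country_codes.get(row["country_name"], "UNKNOWN"),  # enrichment lookup
    } for row in staged]

def reconcile(source: list[dict], target: list[dict]) -> dict[str, bool]:
    """Step 3: basic completeness and accuracy checks (record counts, control totals)."""
    return {
        "record_count_matches": len(source) == len(target),
        "amount_total_matches": round(sum(float(r["amount"]) for r in source), 2)
                                == round(sum(t["amount"] for t in target), 2),
    }

legacy = [{"cust_no": "C001", "amount": "100.10", "country_name": "Netherlands"}]
staged = extract_to_staging(legacy)
loaded = convert_and_enrich(staged, {"Netherlands": "NL"})
print(reconcile(legacy, loaded))   # -> both checks should pass
```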


Figure 1. Transaction driven data migration approach.

The complexity of the data migration increases significantly if historic data is also part of the migration scope. This is due to the fact that organizational data structures are subject to change over time; examples are changes in cost center structures, employee positions, VAT and the Chart of Accounts. This adds a lot of complexity to the data migration and reconciliation logic required, especially as the new ERP system is likely to have a different data structure than the legacy system.

In addition to the higher implementation costs described earlier, we also see that data from legacy systems can block new ways of working in the new ERP system. This is caused by the fact that functionalities are different, or are used differently, in the new system. Hence, less data is an enabler for functional improvements.

With regard to data quality, the new ERP system requires data of adequate quality before any migration activity. This requires an intensive data cleansing process and in some cases housekeeping activities (for example, clearing open items and/or enriching master data in the legacy system). The inherent advantage of limiting the migrated data to active data is that the scope of the cleansing activities is limited to the boundaries of the dataset selected for migration.

The last argument for limiting the migration scope is related to the duration of the production system outage. The production outage at cutover is likely to increase if higher data volumes are involved, and a longer outage can drastically increase costs due to, for example, higher idle time of machines in factories.

Table driven migration approach

The alternative to a transaction driven migration approach is a table driven data migration approach. In this approach all the tables in the legacy system are migrated and mapped one-to-one to the tables of the new ERP system. However, this approach is restricted to migrations within the same ERP vendor, as table structures need to be aligned between the source and target system. These table structures and their referential integrity are key complexity factors inherent to the table driven approach. In the table driven approach, historic data residing in the legacy system is migrated completely (with no selection criteria) to the new ERP system.

As mentioned above, the table driven approach has its challenges and often results in significantly higher data migration costs.

One of our clients performed a successful table driven data migration, using cutting-edge SAP technologies to overcome the challenges of this approach. The client is a multinational, one of the biggest SAP customers, and has one of the largest transaction volumes in the world. In close cooperation with SAP, the client performed a complete migration from SAP R/3 to SAP Business Suite on HANA, followed by an upgrade to S/4HANA Finance. We provided the client with quality assurance (advice on attention and improvement points) during the complete transformation.

To overcome the challenge of high data volumes, data archiving activities were intensified and data retention times per operating country were revised at the client side. This drastically decreased the scope of data that needed to be migrated.

A table driven data migration approach also requires different types of checks on the completeness and accuracy of data. Consistency checks need to be built into the migration tooling, to assess data consistency between tables and data objects before loading. As data was not validated via ERP embedded functionality, measures needed to be taken to overcome potential data inconsistency. All of these measures create the necessity for a data reconciliation framework that includes checks and balances to verify the completeness and accuracy of the migrated data.
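As a hedged illustration of what such checks could look like, the sketch below verifies referential integrity between two related tables and compares order-independent fingerprints of source and target tables. Table and field names are invented for the example; real migration tooling would of course operate on the actual database tables.

```python
# Illustrative consistency checks for a table driven migration: because data
# bypasses the ERP's functional validation, referential integrity and control
# totals are checked between source and target tables directly.
import hashlib

def referential_integrity_ok(headers: list[dict], items: list[dict]) -> bool:
    """Every line item must reference an existing document header."""
    header_keys = {h["doc_id"] for h in headers}
    return all(item["doc_id"] in header_keys for item in items)

def table_fingerprint(rows: list[dict], key_fields: list[str]) -> str:
    """Order-independent fingerprint of a table, for source/target comparison."""
    digest = hashlib.sha256()
    for line in sorted("|".join(str(r[f]) for f in key_fields) for r in rows):
        digest.update(line.encode("utf-8"))
    return digest.hexdigest()

def tables_match(source_rows: list[dict], target_rows: list[dict], key_fields: list[str]) -> bool:
    """Completeness (row counts) plus accuracy (identical fingerprints)."""
    return (len(source_rows) == len(target_rows)
            and table_fingerprint(source_rows, key_fields)
            == table_fingerprint(target_rows, key_fields))

headers = [{"doc_id": 1}, {"doc_id": 2}]
items = [{"doc_id": 1, "amount": 10.0}, {"doc_id": 2, "amount": 5.0}]
print(referential_integrity_ok(headers, items),
      tables_match(items, list(items), ["doc_id", "amount"]))
```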

The S/4HANA Finance upgrade program performs built-in consistency checks between data elements. For our client, this resulted in mandatory, extensive housekeeping and cleansing activities over a six-month period, covering many years of historic data.

To manage an acceptable outage duration whilst migrating the full table scope, a more advanced technology was required for our client. A technique that can be used to divide the data volumes into several migration loads is the Near Zero Down Time (NZDT) technique developed by SAP. NZDT places triggers on SAP tables that record the changes made to those tables after the trigger is set, which makes it possible to divide the full data load into several increments. This technique allows the data migration to start while the legacy system is still in production.
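Conceptually, the technique boils down to a bulk load of a snapshot plus incremental delta loads driven by a trigger-populated change log. The sketch below illustrates only that general idea in Python; it is not SAP’s NZDT tooling, and every table and function name in it is an assumption made for illustration.

```python
# Conceptual sketch only (not SAP's NZDT tooling): a change log populated by
# triggers from a chosen point in time allows the bulk of the data to be
# migrated while the legacy system is live, followed by small delta loads.
from datetime import datetime

change_log: list[dict] = []          # rows recorded by the trigger after it is set

def trigger(table: str, key: str, changed_at: datetime) -> None:
    """Stand-in for a database trigger: records which rows changed after the cut point."""
    change_log.append({"table": table, "key": key, "changed_at": changed_at})

def initial_load(snapshot_rows: list[dict]) -> int:
    """Bulk load of the snapshot taken when the trigger was set; the system stays live."""
    return len(snapshot_rows)        # placeholder for the real load

def delta_load(since: datetime) -> list[dict]:
    """Each increment migrates only the rows recorded in the change log since the last run."""
    return [c for c in change_log if c["changed_at"] >= since]

# Example: the trigger records a change during the bulk load; the next increment picks it up.
cut_point = datetime(2017, 6, 1, 0, 0)
trigger("document_headers", "doc 4711", datetime(2017, 6, 1, 12, 30))
print(delta_load(since=cut_point))
```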

Due to the intensity of this data migration effort, and to avoid complicating the data migration further, no significant functional improvements were made at the client during the migration project itself. The client has initiated an additional improvement program (for which the migration to the S/4HANA platform is a prerequisite) to reap the full benefits of the new ERP system. This additional three-year program will deliver the functional improvements that are ready for implementation on the new platform; a full end-to-end redesign of the SAP business processes is one example of the functional improvements that will be performed after the migration to the S/4HANA platform.

It is certainly worth mentioning that the migration described above is extremely complex and unique in nature, even though it is a migration between platforms of the same ERP vendor (SAP).

This approach requires a full description of each table and its associated migration and transformation logic, documented in a data migration blueprint. Developing such a blueprint requires advanced in-house knowledge.

As mentioned earlier, due to the high data volume, the data reconciliation process and the effort required to verify the completeness of the migrated data were also greater than in the transaction driven approach. Inherent to this reconciliation approach, an incident handling method is absolutely necessary to derive information and value from the errors that occur during the migration process.

The incident management process at our client helped to categorize, address and resolve the test incidents that resulted from either the migration or the reconciliation. Both the migration and reconciliation at this client were performed using customized SAP tooling.

Comparison of the data migration approaches

As mentioned, data migration can add value to the organization. This value can be categorized and explained at a high level in relation to the pillars below:

  1. Accuracy: data resides in the system(s) of choice within the organization. Increased data integrity results in less data redundancy and fewer occurrences of bad source data, thereby making data within the organization more accurate.
  2. Completeness: transactions in the organizational database are more enriched, as data within the organization is better categorized across systems and possibly even centralized.
  3. Compliance: data migration transfers and/or merges data in systems, which ultimately has a positive effect on data integrity. However, it is crucial to remain compliant with tax, local and statutory requirements regarding data retention.
  4. Efficiency: data migration results in lower operational costs as a result of the renewed IT infrastructure. Increased data integrity can result in a decline in search costs for individuals and an increase in organizational agility.

The two data migration approaches are compared based on these pillars. For each pillar specific characteristics and requirements come into play for each migration approach. The table below provides an overview of these requirements and characteristics.


Table 1. Transaction driven vs. Table driven: characteristics and requirements for data migration.

Data Migration as an enabler for organizational growth

Designing and performing a solid data migration process is a demanding undertaking that requires, at a minimum, highly skilled personnel and mature processes. Recognizing the importance and complexity of data migration, and acting accordingly, is a must for executives who want to be in control of their company. Data migration should not be seen solely as an IT exercise but also, and more importantly, as a business responsibility that has a significant impact on a company’s daily business.

A good question to ask yourself is: when will the data migration be considered successful? The ambition level for the data migration is defined and designed in the planning phase.

Data migration will benefit the organization in various ways:

  • Data Modeling: identification and mapping of data relationships (referential integrity) as a result of the data landscape exposure.
  • Data Governance: appointment of data owners and associated roles and responsibilities.
  • Efficiency: less idle search time and a lower data footprint as a result of the (partly) renewed IT architecture.

With regard to the transaction driven and table driven approaches for data migration that are outlined in this article, there is no ‘best option’ as the two approaches differ significantly in complexity and cost. Organization specific needs and complexity are key in determining which data migration approach to pursue. In all cases, data migration can be considered as an enabler for growth with a positive impact on organizational efficiency, data governance and regulatory compliance.