The driver for data migration is often a business decision to modify the ERP landscape. Beyond its necessity and complexity, data migration is a value-adding activity for your organization. It is pivotal to take the time to consider the factors that determine the most suitable migration approach: migration costs, system performance, outage duration, data volume, data quality, data retention and the enablement of new system functionality. This article describes two alternative approaches and how each copes with these factors.
Introduction to data migration
Data migration is a systematic and phased approach for transferring data between (multiple) systems based on a specific migration methodology. The driver for data migration is often based on the business decision to modify the ERP landscape.
Data migration is a value-adding activity for the organization. Moreover, an end-to-end data migration exposes the full data landscape of an organization in a unique way. Going through the migration process can reveal many unknowns in organizational data and data relationships, and enables mapping of the full organizational data landscape. In addition, data migration supports the identification of data owners and can be an improvement, or a solid start, for setting up data governance within your organization.
It is worth mentioning that data migrations are always complex by nature. It is therefore of critical importance to design internal controls that provide assurance regarding the completeness and accuracy of the data selected for migration: a solid and extensive data reconciliation framework, including checks and balances, is crucial to verify the migrated data. Based on our experience, data migration consumes on average 15-25% of the project implementation budget.
Based on our industry experience we have identified two approaches to ERP data migration, each with its own specific complexity and challenges. Both are explained in detail in this article.
The transaction driven approach is the most common approach for an ERP migration. In this approach data is extracted from the legacy system and converted into a load file which in turn can be loaded with the use of standard load programs provided by the ERP system. Often the scope of data migration is limited to active data and is therefore more cost effective when compared with the table driven approach.
The table driven approach is less commonly used, as it entails the full transfer of tables without any selection criteria. Based on our experience with a multinational client, we have an interesting real-life case where the table driven approach was implemented successfully in an SAP ERP environment.
The following paragraphs describe the transaction driven and the table driven approach in turn, together with their inherent benefits and challenges.
Transaction driven migration approach
In the transaction driven approach the migrated dataset is often limited to active data. The active dataset is extracted from the legacy system and converted into load files. These load files can be uploaded using standard load programs or APIs provided by the ERP system. By using the standard load programs, transactions are processed in the system as if they were new transactions. This approach leverages the functional consistency and completeness checks embedded in the ERP system and its load programs. In addition, any error and warning messages from the data load provide useful information on both transactional and associated master data. These messages provide insight into the:
- changes required to the conversion logic;
- enrichment that needs to take place;
- required cleansing activities.
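As an illustration, this feedback loop can be sketched as a simple triage of load messages into the three follow-up activities listed above. The message codes and routing rules below are hypothetical, not an actual ERP load-program API:

```python
# Route messages from a (hypothetical) standard load program into the
# three follow-up activities named above. Codes are illustrative only.
FOLLOW_UP = {
    "MAPPING_ERROR": "change conversion logic",
    "MISSING_FIELD": "enrich source data",
    "INVALID_VALUE": "cleanse source data",
}

def triage(messages):
    """Group load messages by the follow-up activity they imply."""
    actions = {}
    for msg in messages:
        action = FOLLOW_UP.get(msg["code"], "investigate manually")
        actions.setdefault(action, []).append(msg["record_id"])
    return actions

messages = [
    {"record_id": "INV-001", "code": "MISSING_FIELD"},
    {"record_id": "INV-002", "code": "INVALID_VALUE"},
    {"record_id": "INV-003", "code": "MISSING_FIELD"},
]
print(triage(messages))
```

In practice such a triage would feed the incident handling process, so each migration cycle systematically reduces the number of open conversion, enrichment and cleansing items.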
As a consequence of the transaction driven approach, a data warehouse must serve as the data repository for historic data instead of the new ERP system. This approach also makes it necessary for the customer to maintain the legacy system in read-only mode after go-live in order to comply with local legislation on data retention timelines. However, we see that the need for maintaining the legacy system after go-live decreases rapidly over time: after one year, average system usage is generally restricted to one or two users for ad-hoc requests.
As mentioned earlier, when clients implement a new ERP system, we advise limiting the data migration to active data. Below, we explain the rationale for this limitation and define what active data is.
But what is active data? Active data consists of active transactional data, active master data and general ledger balances. An early data migration scoping is essential for a successful ERP implementation. This scoping exercise starts with defining your active transactional data. Active transactional data consists of financial transactions which are not yet complete at the time of migration or could lead to financial transactions in the future.
It is important to agree on a timeline when defining future transactions and to fix it early, for example transactions that will occur within six months after go-live. Active master data is derived from the scope of the active transactional data: the master data required to process active transactional data defines active master data.
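A minimal sketch of this scoping rule, where "active" means open at the cutover date or expected to post within the agreed six-month horizon (the field names, statuses and dates are hypothetical):

```python
from datetime import date

CUTOVER = date(2024, 1, 1)
HORIZON = date(2024, 7, 1)  # agreed six-month window after go-live

def is_active(txn):
    """A transaction is active if it is still open at cutover, or if
    it is expected to post within the agreed six-month horizon."""
    if txn["status"] == "open":
        return True
    return CUTOVER <= txn["posting_date"] < HORIZON

transactions = [
    {"id": "T1", "status": "open",      "posting_date": date(2023, 11, 3)},
    {"id": "T2", "status": "closed",    "posting_date": date(2023, 5, 20)},
    {"id": "T3", "status": "scheduled", "posting_date": date(2024, 3, 15)},
]
# T2 is complete and historic, so it stays behind in the legacy system.
active = [t["id"] for t in transactions if is_active(t)]
print(active)
```

The active master data scope would then follow from this set: any customer, vendor or material referenced by an active transaction is in scope.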
Although some clients prefer to also migrate historic data, there are clear arguments against doing so. The most prominent reasons not to migrate historic data are the effect of higher data volumes on the performance and stability of the system, the longer system outage duration, and the higher cost of the migration.
Data migration costs consume a large portion of the implementation budget (15-25% on average). This is caused by the necessity of multiple data migration cycles, which are not restricted to rehearsing the migration itself: they also support integration testing, user acceptance testing and regression testing.
Additional cost drivers for data migration are the setup of data migration logic, conversion logic and reconciliation scripts. This setup is required to assure the completeness and accuracy of the migrated dataset. This setup covers:
- extracting data from the legacy system to a staging environment;
- converting and enriching data to the structure required to load into the new ERP system;
- reconciling the end-result based on completeness and accuracy of data.
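A common form of the reconciliation step above is a count-and-total comparison between the staging extract and the records loaded into the new system. The record structure and field names below are illustrative:

```python
def reconcile(extracted, loaded, amount_field="amount"):
    """Compare record counts and amount totals between the staging
    extract and the records loaded into the new ERP system."""
    count_match = len(extracted) == len(loaded)
    total_match = (round(sum(r[amount_field] for r in extracted), 2)
                   == round(sum(r[amount_field] for r in loaded), 2))
    return {
        "count_match": count_match,
        "total_match": total_match,
        "passed": count_match and total_match,
    }

extracted = [{"id": 1, "amount": 100.0}, {"id": 2, "amount": 250.5}]
loaded    = [{"id": 1, "amount": 100.0}, {"id": 2, "amount": 250.5}]
print(reconcile(extracted, loaded))
```

A production reconciliation framework would add many more checks (per-entity subtotals, key-level matching, currency handling), but the principle of comparing source and target on completeness and accuracy is the same.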
Figure 1. Transaction driven data migration approach.
The complexity of the data migration increases significantly if historic data is also part of the migration scope, because organizational data structures change over time. Examples are changes in cost center structures, employee positions, VAT and the chart of accounts. This adds considerable complexity to the required migration and reconciliation logic, especially as the new ERP system is likely to have a different data structure than the legacy system.
Along with the higher implementation costs of including historic data, we also see that data from legacy systems can block new ways of working in the new ERP system, because functionalities are different or are used differently in the new system. Hence, less data is an enabler for functional improvements.
With regard to data quality, the new ERP system requires proper data quality before any migration activity. This requires an intensive data cleansing process and in some cases housekeeping activities (examples: clearing of open items and/or enriching master data in the legacy system). The inherent advantage of limiting the migrated data to active data is that the scope of cleansing activities is limited to the boundaries of the selected dataset for migration.
The last argument for limiting the migration scope is related to the production system outage duration. The production outage at cutover is likely to increase with higher data volumes. A longer production system outage can drastically increase costs due to, for example, higher idle time of machines in factories.
Table driven migration approach
The alternative to a transaction driven migration approach is a table driven data migration approach. In this approach all the tables in the legacy system are migrated and mapped one-to-one to the tables of the new ERP system. However, this approach is restricted to migrations within the same ERP vendor, as table structures need to be aligned between the source and target system. These table structures and their referential integrity are key complexity factors inherent to the table driven approach. In the table driven approach, historic data residing in the legacy system is migrated completely (with no selection criteria) to the new ERP system.
As mentioned above, the table driven approach has its challenges and often results in significantly higher data migration costs.
One of our clients performed a successful table driven data migration, using cutting-edge SAP technologies to overcome the challenges of this approach. The client is a multinational, one of the biggest SAP customers, with one of the largest transaction volumes in the world. In close cooperation with SAP, the client performed a complete migration from SAP R/3 to SAP Business Suite on HANA, followed by an upgrade to S/4HANA Finance. We provided the client with quality assurance (advice on attention and improvement points) during the complete transformation.
To overcome the challenge of high data volumes, data archiving activities were intensified and data retention times per operating country were revised at the client side. This drastically decreased the scope of data that needed to be migrated.
A table driven data migration approach also requires different types of checks on the completeness and accuracy of data. Consistency checks need to be built into the migration tooling, to assess data consistency between tables and data objects before loading. As data was not validated via ERP embedded functionality, measures needed to be taken to overcome potential data inconsistency. All of these measures create the necessity for a data reconciliation framework that includes checks and balances to verify the completeness and accuracy of the migrated data.
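A consistency check of this kind largely boils down to verifying referential integrity between tables before loading. A minimal sketch, using hypothetical document header and line-item tables:

```python
def orphaned_items(headers, items, key="doc_id"):
    """Return item rows whose document key has no matching header row.
    Such rows would break referential integrity if loaded as-is."""
    header_keys = {h[key] for h in headers}
    return [i for i in items if i[key] not in header_keys]

headers = [{"doc_id": "D1"}, {"doc_id": "D2"}]
items = [
    {"doc_id": "D1", "line": 1},
    {"doc_id": "D3", "line": 1},  # no matching header: inconsistent
]
# Any orphans must be cleansed or enriched before the table load starts.
print(orphaned_items(headers, items))
```

Real migration tooling would run such checks across hundreds of table pairs, but each check follows this pattern: build the set of valid keys on one side and flag every dependent row that falls outside it.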
The S/4HANA Finance upgrade program performs built-in consistency checks between data elements. These checks resulted in mandatory, extensive housekeeping and cleansing activities over a six-month period, covering many years of historic data.
To manage an acceptable outage duration whilst migrating the full table scope, a more advanced technology was required for our client. A technique that can be used to divide the data volumes into several migration loads is the Near Zero Down Time (NZDT) technique developed by SAP. NZDT places triggers on SAP tables that record every change from the moment the trigger is set, making it possible to divide the full data load into several increments. This technique allows the data migration to start while the legacy system is still in production.
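The mechanism can be pictured as a change log that starts recording at a set point: the bulk of the data is copied while the legacy system is live, and only the recorded deltas are migrated during the outage. The following is a simplified simulation of that idea, not SAP's actual NZDT tooling:

```python
class ChangeLog:
    """Simulates an NZDT-style trigger: once armed, it records the key
    of every row changed in the legacy table."""
    def __init__(self):
        self.armed = False
        self.changed_keys = set()

    def on_write(self, key):
        if self.armed:
            self.changed_keys.add(key)

legacy = {"A": 1, "B": 2}
log = ChangeLog()
log.armed = True          # set the trigger, then take the initial bulk copy
target = dict(legacy)     # bulk load while legacy is still in production

legacy["B"] = 20          # production keeps running and changing data...
log.on_write("B")
legacy["C"] = 3
log.on_write("C")

# At cutover, only the recorded deltas need to be copied during the outage.
for key in log.changed_keys:
    target[key] = legacy[key]
print(target)
```

Because the delta set is far smaller than the full table, the final synchronization, and therefore the production outage, stays short even at very high data volumes.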
Due to the intensity of this data migration effort, and to not complicate the migration further, no significant functional improvements were made at the client during the migration project itself. The client has initiated an additional improvement program, for which the migration to the S/4HANA platform is a prerequisite, to get the full benefits from the new ERP system. This additional three-year program will facilitate the functional improvements that are ready for implementation on the new platform. A full end-to-end redesign of the SAP business processes is an example of these functional improvements that will be performed after the migration to the S/4HANA platform.
It is certainly worth mentioning that the migration described above is extremely complex and unique in nature, even though it is a migration between platforms of the same ERP vendor (SAP).
This approach requires a full description of each table and its associated migration and transformation logic, documented in a data migration blueprint. Developing such a blueprint requires advanced in-house knowledge.
As mentioned earlier, due to the high data volume, the data reconciliation process and the effort to verify the completeness of the migrated data were also greater than in the transaction driven approach. Inherent to this reconciliation approach, an incident handling method is necessary to derive information and value from errors occurring during the migration process.
The incident management process at our client helped to categorize, address and resolve the test incidents that resulted from either the migration or the reconciliation. Both the migration and reconciliation at this client were performed using customized SAP tooling.
Comparison of the data migration approaches
As mentioned, data migration can add value to the organization. This value can be categorized and explained at a high level along the following pillars:
- Accuracy: data resides in the organization’s system(s) of choice. Increased data integrity results in less data redundancy and fewer occurrences of bad source data, making the organization’s data more accurate.
- Completeness: transactions in the organizational database are more enriched, as the data within the organization is better categorized in system(s) and possibly even centralized.
- Compliance: data migration transfers data and/or merges data in systems which ultimately has a positive effect on data integrity. However, it is crucial to be compliant with tax, local and statutory requirements regarding data retention.
- Efficiency: data migration results in lower operational costs as a result of the renewed IT infrastructure. Increased data integrity can lower search costs for individuals and increase organizational agility.
The two data migration approaches are compared based on these pillars. For each pillar specific characteristics and requirements come into play for each migration approach. The table below provides an overview of these requirements and characteristics.
Table 1. Transaction driven vs. Table driven: characteristics and requirements for data migration.
Data Migration as an enabler for organizational growth
Designing and performing a solid data migration is a demanding exercise that requires highly skilled personnel and mature processes at a minimum. Recognizing the importance and complexity of data migration, and acting accordingly, is a must for executives who want to be in control of their company. Data migration should not be seen solely as an IT exercise but also, and more importantly, as a business responsibility with a significant impact on a company’s daily operations.
A good question to ask is: when will the data migration be considered successful? The ambition level for the data migration should be defined and designed during the planning phase.
Data migration will benefit the organization in various ways:
- Data Modeling: identification and mapping of data relationships (referential integrity) as a result of the data landscape exposure.
- Data Governance: appointment of data owners and associated roles and responsibilities.
- Efficiency: less idle search time and a lower data footprint as a result of the (partly) renewed IT architecture.
With regard to the transaction driven and table driven approaches for data migration that are outlined in this article, there is no ‘best option’ as the two approaches differ significantly in complexity and cost. Organization specific needs and complexity are key in determining which data migration approach to pursue. In all cases, data migration can be considered as an enabler for growth with a positive impact on organizational efficiency, data governance and regulatory compliance.