You’ve probably been hearing about data migration as a buzzword for a while now. But in the meantime, you haven’t had the chance to understand the process in depth, right? Don’t worry, we are going to help you.
In short, data migration is the process of transferring data between different file formats, databases, or storage systems. It seems quite easy, right?
However, the task of transferring information is full of details and procedures that you should perform before, during, and after the migration. Some challenges may arise, and you must be prepared to overcome them. This article will help you identify these challenges, clarify the main topics of data migration, and provide guidelines that will lead you to success, as outlined in the index below:
- What is data migration?
- Why migrate your data?
- Strategy and planning
- Know your data – types of data
- Main challenges in data migration
- The connection between data migration and the main software project
- Working environments
- Assess the quality of your data migration project
- Final tips: get everyone ready to migrate data!
1. What is data migration?
It is the process of using a database migration system to transfer information from source database server(s) to one or multiple new target servers. When the process finishes, the new database(s) store(s) all data. However, the data structure can be quite different.
All clients that accessed the original data source are also transferred to the new one and the source is disabled.
The data migration process is simply displayed in the diagram below:
So, let’s suppose you run a functioning business. You’re working with an outdated system that has some offline modules, or with online software that no longer meets your needs. After doing a business analysis, you decide to update this system for greater efficiency and security. You will then rebuild it in a new environment, ensuring the same features and most probably adding some new ones.
After researching the technology, planning the strategy, defining a methodology, and analysing and structuring all data, you will start to analyse the data transfer itself.
Each data migration project is unique, and its traits vary with the technology used and the data itself. But no matter what the scenario is, if you have already decided to transfer your company’s data to a new system that is more capable, safe, and up to date, including cloud solutions, it is crucial to follow the best practices available to avoid information loss.
2. Why migrate your data?
There are many reasons to migrate data. The most common one is the system’s inability to meet business needs, either because of low processing power or because some needs were not mapped when the system was implemented. In addition, you should always keep in mind that technology should not restrain your business growth.
Another common reason to migrate data is tech lag. It happens when the original system no longer offers a safe solution or has performance issues. Also, during a merger or an acquisition, IT systems will almost certainly be merged, and all information will have to be migrated into one single system.
Digital transformation is also a reason, e.g. when the company is focused on innovative improvements for its employees, its operation, or the business, so as to guarantee a competitive advantage in the market.
3. Strategy and planning in data migration
Before the migration process starts, the first step is to understand its needs and goals, such as the reasons we mentioned above. Then we start mapping these goals. At the same time, we identify which data will be migrated, where to, and how it will be structured. Data migration is a fully tailor-made process.
When you are planning your data migration process, you should follow a checklist. Don’t miss any detail, including data that is already being collected and monitored at the moment. Below is a checklist that might help you plan this task:
- Define your data migration scope
- Identify potential risks and issues
- Define how the process will run
- Clarify which methodologies and technologies to use
- Get support from business units to identify gaps and misunderstandings
- Identify tools to support the process (such as Salesforce tools)
4. Know your data – types of data
One of the biggest challenges is to identify which types of data your company has.
Some of the business data on your old server refers to finished processes, and this information will not change. This is your historical data. It gathers all information up to today, for example sales records or invoices, and it will take its place on the new server. This data helps you understand your business better and provides analysis capabilities, such as finding patterns and creating forecasts. In some particular situations, parts of this information will no longer make sense after the migration. You can erase them beforehand.
When we talk about ongoing processes that are not finished yet, we are talking about operational data. This information must exist in the new system so that your work can continue and eventually be concluded.
Lastly, there is reference data: the information that categorizes the business. For instance, an app that supports a car workshop business will hold vehicle brands as reference data, as well as each model and each type of intervention or repair. If you need to cluster and categorize business data, you use reference data.
5. Main challenges in data migration
1. About data
Most database servers are relational (SQL). That popularity comes from the simple, friendly way they present information in spreadsheet-like tables, and from how easy it is to establish relationships between data.
Even though the type of database from the source server and the target server is usually similar, the model itself can be distinct – displaying a different structure to store all information. This requires that you not only transfer data but also transform the information to fit the new model.
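As a minimal sketch of this transfer-and-transform idea, assume a hypothetical source table that stores a customer's full name in a single column, while the target model splits it into two fields. Every field name and the splitting rule below are illustrative assumptions, not taken from any real system:

```python
def transform_customer(source_row: dict) -> dict:
    """Map one source record onto the (hypothetical) target model."""
    # Assumed rule: the first word is the first name, the rest is the last name.
    first_name, _, last_name = source_row["full_name"].strip().partition(" ")
    return {
        # Keep the source identifier so each record can be traced back later.
        "external_id": str(source_row["id"]),
        "first_name": first_name,
        "last_name": last_name.strip(),
        "phone": source_row.get("telephone"),
    }


source_rows = [{"id": 17, "full_name": "Ana Sousa Lima", "telephone": "555-0101"}]
target_rows = [transform_customer(row) for row in source_rows]
```

Keeping the transformation in one small function per record type makes it easy to test each mapping rule on its own before any data is loaded.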
Data quality – general fields
Another thing to keep in mind is data quality. Evaluating data might seem hard, but it is a mandatory task. You might even need to improve this information before inserting it into the new server. Let’s imagine that you have a database field to store e-mail addresses, but it wasn’t properly filled in and now, in a significant number of records, it contains telephone numbers instead. Suppose also that the new server has a validation mechanism to ensure that this field always contains a “@”. Inserting a telephone number in this field will then generate an error. Thus, you either have to improve data quality or deactivate that restriction during migration. Any other approach will make you lose information.
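The e-mail scenario can be checked before loading anything. The sketch below, with hypothetical record and field names, separates records that would pass a simple “@” rule from those that need a fix in the source data:

```python
def split_by_email_quality(records):
    """Separate records that would pass an "@" check from those that would not."""
    valid, needs_review = [], []
    for record in records:
        email = (record.get("email") or "").strip()
        # Mirrors the simple rule from the example: the field must contain "@".
        # Empty values are also sent for review in this sketch; whether the real
        # target system accepts them is an assumption to confirm.
        (valid if "@" in email else needs_review).append(record)
    return valid, needs_review


records = [
    {"id": 1, "email": "maria@example.com"},
    {"id": 2, "email": "+351 555 0102"},  # telephone number stored in the e-mail field
    {"id": 3, "email": None},
]
valid, needs_review = split_by_email_quality(records)
```

The size of `needs_review` gives an early estimate of how much clean-up the field needs before migration.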
Data quality – reference data
The new system could also lack some reference data. Let us get back to the car workshop example to understand this issue better. Suppose that the new system needs information about each car’s brand and model, but its reference data doesn’t include the older models. If the source server has cars that refer to those brands or models, their migration will fail.
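A quick way to find such gaps ahead of time is to compare the values used in the source with the reference data loaded in the target. This is a sketch with made-up brands, models, and field names:

```python
def missing_reference_values(source_cars, target_reference):
    """Return the (brand, model) pairs used by source cars but absent from the target."""
    used = {(car["brand"], car["model"]) for car in source_cars}
    return sorted(used - set(target_reference))


source_cars = [
    {"plate": "AA-01", "brand": "Brand A", "model": "Model 1"},
    {"plate": "BB-02", "brand": "Brand A", "model": "Old Model"},
    {"plate": "CC-03", "brand": "Brand A", "model": "Old Model"},
]
target_reference = {("Brand A", "Model 1"), ("Brand B", "Model 2")}
gaps = missing_reference_values(source_cars, target_reference)
```

Each pair in `gaps` either has to be added to the target reference data or mapped to an existing value before the cars are migrated.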
Data quality – status fields
Sometimes an error occurs because there is no direct correspondence between the values of a status attribute in the source and target systems. Suppose that each car may have one of the following statuses in the old system: “service appointment proposed”, “service appointment accepted” or “service declined”. The new system, however, may have one more value: “service appointment needed next month and yet not proposed”. In that case, you must calculate, for each car in the source system, whether it should take this new status instead of its current one.
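The calculation could look roughly like the sketch below. The rule itself is an assumption made for illustration: a car gets the new status when a hypothetical `next_service_due` date falls in the calendar month after the migration date and no appointment is currently proposed or accepted. The real rule has to come from the business.

```python
from datetime import date
from typing import Optional

NEW_STATUS = "service appointment needed next month and yet not proposed"
OPEN_STATUSES = {"service appointment proposed", "service appointment accepted"}


def is_next_month(day: date, today: date) -> bool:
    """True when `day` falls in the calendar month that follows `today`."""
    if today.month == 12:
        year, month = today.year + 1, 1
    else:
        year, month = today.year, today.month + 1
    return (day.year, day.month) == (year, month)


def target_status(source_status: str, next_service_due: Optional[date], today: date) -> str:
    """Decide the status a car should have in the target system (assumed rule)."""
    if (
        next_service_due is not None
        and source_status not in OPEN_STATUSES
        and is_next_month(next_service_due, today)
    ):
        return NEW_STATUS
    # Otherwise the source status carries over unchanged.
    return source_status
```

For example, `target_status("service declined", date(2024, 1, 15), date(2023, 12, 20))` yields the new status, while a car with a proposed appointment keeps its current one.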
Non-existing data – new business concepts
Certain business concepts may even be new and not exist in the old system. For instance, the car workshop software can demand that every car service has an owner (someone who takes accountability for that service). If this information is not stored in the old system, it has to be calculated before migration. You can assume that the employee who received the car is its owner, provided this information exists in the system. Entering all this information manually is not practical, given the huge number of car services in the old system.
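A derivation rule like this one can be expressed in a few lines. The sketch assumes a hypothetical `received_by` field on each service record and lists the records for which the rule cannot produce an owner:

```python
def assign_service_owner(service: dict) -> dict:
    """Return a copy of the service record with an owner derived from existing data."""
    migrated = dict(service)
    # Assumed rule from the example: the employee who received the car owns the service.
    migrated["owner"] = service.get("received_by")
    return migrated


services = [
    {"id": "S1", "received_by": "emp-07"},
    {"id": "S2", "received_by": None},  # no receiving employee recorded
]
migrated = [assign_service_owner(s) for s in services]
# Records the rule could not resolve still need a business decision.
unresolved = [s["id"] for s in migrated if s["owner"] is None]
```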
Data quality improvement – when and where
We recommend that any data change or improvement takes place on the source server. This ensures that all the right information is available to users in the new system right after the migration. Also, doing massive data fixes in the new production environment will certainly interfere with data changes already made there by end-users.
2. The modular and iterative nature of a data migration process
Even though you may still lack some information about architecture and fields, you should start your data migration design and implementation as soon as possible. Otherwise, you would be working at a purely conceptual level, which could delay the process and the testing of migrated data to an unacceptable extent.
The process of data migration should be divided into modules that, more or less, match the functional ones. Also, data related to the interconnections between those modules must be identified – the interconnecting submodules. Consequently, the migration process can be prioritized by modules and then interconnecting submodules. These can be launched when the work done within the main modules is enough to allow it.
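One way to derive a launch order from such module dependencies is a topological sort. The sketch below uses Python's standard `graphlib` module; the module names and their dependencies are invented for the car workshop example and are not a prescribed breakdown:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a migration module; its set lists the modules that must be loaded first.
dependencies = {
    "customers": set(),
    "cars": {"customers"},
    "workshops": set(),
    "car_services": {"cars", "workshops"},  # interconnecting submodule
}

# static_order() yields every module after all of the modules it depends on.
migration_order = list(TopologicalSorter(dependencies).static_order())
```

The exact order among independent modules may vary, but an interconnecting submodule always comes after the main modules it links.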
In each module, you should first invest in migrating the data that ensures the relationships between tables are right, as well as record uniqueness (External_id from the source system). This is what we call the migration skeleton. Then, you should proceed with the remaining data to the extent of your knowledge. After that, invest time in the other modules, coming back to the initial ones when you know more. Adding more data, and fixing and refining the modules already included in the migration, is what makes this an iterative process.
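The skeleton-first approach can be sketched with in-memory dictionaries standing in for target tables. Records are keyed by the source identifier (the External_id idea), so every iteration can be rerun safely. Table and field names here are hypothetical:

```python
def upsert(table: dict, external_id: str, fields: dict) -> None:
    """Insert the record if it is new, otherwise update it in place."""
    table.setdefault(external_id, {}).update(fields)


def load_skeleton(source_customers, source_cars, target):
    """First iteration: keys and relationships only. Returns cars whose parent is missing."""
    for customer in source_customers:
        upsert(target["customers"], customer["id"], {})
    orphans = []
    for car in source_cars:
        if car["customer_id"] in target["customers"]:
            upsert(target["cars"], car["id"], {"customer_external_id": car["customer_id"]})
        else:
            orphans.append(car["id"])
    return orphans


def load_details(source_cars, target):
    """Later iteration: add descriptive fields to records that already exist."""
    for car in source_cars:
        if car["id"] in target["cars"]:
            upsert(target["cars"], car["id"], {"plate": car["plate"]})


target = {"customers": {}, "cars": {}}
customers = [{"id": "C1"}]
cars = [
    {"id": "K1", "customer_id": "C1", "plate": "AA-01"},
    {"id": "K2", "customer_id": "C9", "plate": "BB-02"},  # parent missing in source
]
orphans = load_skeleton(customers, cars, target)
load_details(cars, target)
```

Because `upsert` only adds or overwrites fields, running the skeleton load again after the details load leaves the enriched records intact.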
Of course, you should have access to a development or test environment in which to run the iterative migration process.
3. Communication and decision
Source system understanding
When you decide to migrate data between two different systems, you need to know not only about your data and what it means in the system, but also what it means for your business. For this, identify someone who knows the overall business as well as all its relationships to the source data model, and schedule a weekly meeting with this key person to support the consulting team throughout the migration project.
Target system understanding
It’s also important that a migration team member participates in the project’s business meetings, to discuss features and to acquire knowledge about the new solution, the destination data model, and its relationship with the business. Being aligned with and involved in the business discussion is key to building the new data model.
An important contribution of the migration team is to help ensure data integrity in the new system, e.g. by avoiding replication of the same information in different objects, or replication of relationship information.
Migration issues clarification
Finally, it is mandatory to set aside a weekly slot to answer questions, discuss concerns, and make decisions. To bring questions to the surface, it’s crucial to encourage communication with clear diagrams that explain the information flows between the systems and the potential data transformations.
As an example, let’s get back to the car workshop project and look at discount information:
Each kind of car revision can be performed in several workshops, whether headquarters or not. Revision planning has an established date. It also requires picking a revision workshop among those capable of performing that intervention, headquarters or not. This information generates a revision plan that stores the details of the particular intervention.
In the source system, discounts were connected to “revision plans”. In the target system, however, they connect to the planning workshop (which is the headquarters). So it’s important to ensure that any discount information to be migrated has a planning headquarters workshop. This planning workshop should correspond to the revision headquarters workshop, even if it was not selected for the planning.
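One possible reading of this rule, sketched below, is that each discount moves from its revision plan to the headquarters of that plan's revision workshop. The data layout (a `headquarters_id` that is empty for a headquarters workshop) and all identifiers are assumptions made for illustration:

```python
def headquarters_of(workshop_id, workshops):
    """Return the headquarters for a workshop (a headquarters is its own headquarters)."""
    parent = workshops[workshop_id]["headquarters_id"]
    return workshop_id if parent is None else parent


def remap_discounts(discounts, revision_plans, workshops):
    """Re-attach each discount from its revision plan to the headquarters planning workshop."""
    remapped = []
    for discount in discounts:
        plan = revision_plans[discount["revision_plan_id"]]
        remapped.append({
            "external_id": discount["id"],
            "percent": discount["percent"],
            "planning_workshop_id": headquarters_of(plan["revision_workshop_id"], workshops),
        })
    return remapped


workshops = {"HQ1": {"headquarters_id": None}, "W2": {"headquarters_id": "HQ1"}}
revision_plans = {"P1": {"revision_workshop_id": "W2"}}
discounts = [{"id": "D1", "revision_plan_id": "P1", "percent": 10}]
migrated_discounts = remap_discounts(discounts, revision_plans, workshops)
```

A transformation like this is exactly the kind of decision worth confirming with the business in the weekly meeting before it is implemented.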
Overall idea about communication
In short, to succeed in a data migration project it is important to let information flow. Of course, it is useful to have the skills to investigate and identify relevant questions by querying the database oneself. However, context-based communication between stakeholders allows the process to thrive and helps everyone speak the same language, identify challenges, and solve problems together.
So, we are always working through a collaborative methodology.
6. The connection between data migration and the main software project
There are always common interests between the data migration project and the main software project. Both will benefit if a key person from the data migration team helps in the development of the new system’s data model. This avoids issues such as software that is more complex and harder to maintain in terms of data, or a larger migration process. Let’s see some examples:
1. Replicated information
When the same data is replicated in different objects and tables of the same database, extra effort is required to develop and test the mechanisms that ensure this information is the same wherever it is stored, regardless of any changes.
From the migration point of view, the outcome is more data to transfer. Check the example with replicated fields between “revision planning” and “integrated revision planning” tables:
If we have Year and Month in the integrated table, we don’t need to repeat them in any revision planning record that refers to it.
2. Duplicated hierarchical relationship
Given a son-father relationship and a father-grandfather relationship, it would be redundant to store an additional son-grandfather relationship. Let’s get back to the car workshop example to better understand this. If the data model holds the parts dealer for each brand/model, as well as the country of origin of each dealer, it is not necessary to directly define the dealer’s country for each car brand/model (the dashed line in the figure below).
Otherwise, you would have to develop and test additional procedures to ensure the integrity of changed data (the country of a brand/model parts dealer must be the same, whether you follow the dashed line or the other two). It would also require extra effort in the migration project to transfer even more information: the dashed line data.
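In code terms, the redundant relationship can be derived on demand from the two that already exist. The lookup tables and values below are invented for illustration:

```python
# Two stored relationships: brand/model -> parts dealer, and dealer -> country.
dealer_by_model = {("Brand A", "Model 1"): "dealer-1"}
country_by_dealer = {"dealer-1": "Portugal"}


def dealer_country(brand: str, model: str) -> str:
    """Derive the dealer's country for a brand/model by following both relationships."""
    return country_by_dealer[dealer_by_model[(brand, model)]]
```

Because the country is stored only once per dealer, changing it there is immediately reflected for every brand/model, and there is no third relationship to keep in sync or to migrate.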
7. Working environments
To support the development and testing of the migration process, the support team must have access to copies of the source and target databases or systems, to use as development and test environments. This keeps development and testing safe, preventing any data damage in the production environment. In a data migration process, it is the only way to cope with its iterative nature.
8. Assess the quality of your data migration project
As with any software project, data migration will generate errors too. Therefore, you must set aside time and energy from a person or a quality team to test the migrated data.
Basically, the relevant information and its relationships in the source system must exist in the target system, taking into account the transformations required and applied during the migration process.
It is also important to have access to a quality system for recording planned and executed tests, as well as detected bugs. You also need to configure that system for this particular use so that it can support further migration work. This keeps defect types consistent with data testing, e.g. “data not found in the target system”, “data does not have the right value – truncation” or “data does not have the right format – decimal digits”. Subsequent analysis of the testing activity and decision-making support, based on test coverage and open (not yet solved) defects, then become possible.
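A reconciliation check along these lines can be sketched as follows. It compares the values expected after transformation with what the target actually holds and tags each difference with a defect type. The type labels, record layout, and detection rules are simplifying assumptions, not the behaviour of any particular quality tool:

```python
from decimal import Decimal


def classify_defect(expected, actual):
    """Return a defect type for one field, or None when the values match."""
    if expected == actual:
        return None
    if actual is None:
        return "not_found"
    if isinstance(expected, str) and isinstance(actual, str) and expected.startswith(actual):
        return "truncation"
    if isinstance(expected, Decimal) and isinstance(actual, Decimal):
        # A larger exponent means the target kept fewer decimal digits.
        if actual.as_tuple().exponent > expected.as_tuple().exponent:
            return "decimal_digits"
    return "other_mismatch"


def reconcile(expected_records, target_records):
    """Compare expected (already transformed) values with the target, keyed by external id."""
    defects = []
    for external_id, expected_fields in expected_records.items():
        target_fields = target_records.get(external_id)
        if target_fields is None:
            defects.append((external_id, None, "not_found"))
            continue
        for field, expected in expected_fields.items():
            defect = classify_defect(expected, target_fields.get(field))
            if defect:
                defects.append((external_id, field, defect))
    return defects


expected = {
    "K1": {"notes": "Replace front brake pads", "price": Decimal("19.99")},
    "K2": {"notes": "Oil change", "price": Decimal("45.00")},
}
target = {"K1": {"notes": "Replace front", "price": Decimal("20.0")}}
defects = reconcile(expected, target)
```

Counting the resulting tuples per defect type gives a simple view of test coverage and open defects for each migration iteration.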
Stellaxius built a Salesforce app for tracking quality assessment.
It’s also important to state that identifying and fixing defects during the development and test phases, in their respective environments, costs far less and has far less impact than doing so in a production environment after the system goes live.
So, we recommend you invest properly in Quality Assurance (QA).
9. Final tips: get everyone ready to migrate data!
Planning a software project is not only a technical challenge but also a matter of company culture. You must get everyone ready for change, reducing resistance as procedures are adapted and people move to the new technology.
It’s also important to provide internal support so that everyone understands the value of this innovation and how learning and adapting to new technologies will benefit each person’s career. Consequently, our methodology at Stellaxius focuses on training and on raising awareness among all employees of the advantages of the new system and of the changes in procedures.
Are you ready to start your digital transformation?
Contact us.