Recently I interviewed a consulting manager who had just delivered a very large ERP data migration. His biggest challenge? Getting the business users engaged early in the project. It is a very familiar story: I have been encountering the same challenge for years in delivering application data migrations. And it cuts both ways: either the business users are unwilling to spend the time, or the IT team does not recognize the need. In this case the team was delayed significantly (a project that should have taken four months lasted nearly seven), as data issues reared their ugly heads at the end of the development cycle, during the first mock load of the data. Lack of collaboration has real impacts. In a recent white paper published by Informatica, The Five Pitfalls of Data Migration, one of the pitfalls outlined was a lack of collaboration between the business users (or data experts) and the data migration technical team.
What do I mean by business/IT collaboration? Really I am talking about baking data stewardship into the process of application data migration. And the tools to support that process.
Data migration teams often fail to set up the appropriate level of data stewardship, or collaboration, with the business, for a variety of reasons. This has a greater impact on the success of an application data migration (and the associated modernization or post-acquisition data integration program) than any other single factor. When business data experts are not involved, the technical team is often left with no choice but to make decisions about what data should move forward, how it should be mapped, or what is 'good enough' for the target systems. These are rightfully business decisions. The net result is either delays when the business is brought into the project to provide guidance on data issues as 'one-offs', or worse, delays when the wrong data is pulled forward. In the case of my friend the consultant, the data was deemed wrong during the first phase of load testing, and his team was faced with rewriting mappings and business rules to support requirements that could have been identified early in the project.
So what drives this problem, and what can we do to resolve it? I see two trends (and I invite you to comment on what you have experienced). One is that the team as a whole, business and IT, does not realize the complexity of the data migration challenge: as part of an upgrade or consolidation you are likely transforming (and that means changing) the data that drives your business. This can be difficult and is always a risk-filled proposition. The second common challenge I encounter is the belief that until we start 'coding', or actually moving data, no real work is getting done. There is an unfortunate rush to move data before even validating that we have the correct source, identifying what duplicates may exist, or confirming that the mapping reflects the actual state of the source data.
I stepped into a project recently to provide some data analysis and discovered that 20% of the master data was missing. MISSING! The developers (and luckily we were still in development) were pulling data from only one of the two needed tables. They had not done basic profiling to validate that the source to target mapping provided by the business was correct, yet they were well on their way to completing their initial data load. By applying some simple profiling I was able to discover the error and save the team weeks of rework that would have been needed later in the project. The error would have been caught earlier had the team included an initial data validation and review, combined with an active data stewardship team.
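A completeness check of this kind takes only a few lines. Here is a minimal sketch; the key values and table names are hypothetical, and a real project would run the equivalent query against the actual source, but the idea is the same: compare the keys in the migration extract against the union of keys from every source table that should feed it.

```python
def missing_master_keys(extract_keys, *source_tables):
    """Return the source keys that never made it into the extract."""
    expected = set().union(*(set(t) for t in source_tables))
    return expected - set(extract_keys)

# Hypothetical example: customers live in two tables (an active table
# and a legacy one), but the extract only pulled from the first.
active_customers = [f"C{i:03d}" for i in range(1, 9)]   # 8 customers
legacy_customers = ["C009", "C010"]                     # 2 more customers
extract = list(active_customers)                        # legacy table skipped

missing = missing_master_keys(extract, active_customers, legacy_customers)
pct = 100 * len(missing) / (len(active_customers) + len(legacy_customers))
print(sorted(missing), f"{pct:.0f}% of master data missing")
```

Running a check like this before the first load would have flagged the gap in minutes rather than weeks.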
So here are my four recommendations for any data migration, three of which a team should accomplish before development of data migration transformations.
- Validate that you have the correct source using some high level master data analysis. For example, can you prove that you are pulling all the customers? Are there duplicates?
- Spend some time profiling the data on the source and share that information with the business team before you start source to target field level mapping. It is likely that they will be surprised by the findings, and it will improve everyone's ability to map the data correctly. For example, a business user may not realize that there are 50,000 inactive customers in the database, left over from a previous migration or a failed archive process, or that the code set believed to have five valid values actually has 15.
- When the source to target mapping is complete, run some targeted data profiling to ensure that the assumptions are correct. For example, are the fields fully populated? Can the source data support the referential integrity of the target?
- Finally, continue profiling and evaluating the quality of the data throughout the migration process, and sharing this information with the business. For example, are you expecting to see this many duplicates?
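To make these recommendations concrete, here is a minimal profiling sketch. The field names and sample rows are hypothetical, and a real project would use a profiling tool or SQL against the actual source, but the checks themselves mirror the list above: duplicates, unpopulated fields, unexpected code values, and broken referential integrity.

```python
from collections import Counter

def profile(rows, key, required, valid_codes, code_field, target_keys, fk_field):
    """Run basic profiling checks on a list of dict-shaped source rows."""
    report = {}
    # Duplicate check: any key value appearing more than once
    counts = Counter(r[key] for r in rows)
    report["duplicates"] = [k for k, n in counts.items() if n > 1]
    # Population check: fields the mapping assumes are always filled
    report["unpopulated"] = {f: sum(1 for r in rows if not r.get(f))
                             for f in required}
    # Code-set check: values outside the agreed list of valid codes
    report["bad_codes"] = sorted({r[code_field] for r in rows} - valid_codes)
    # Referential-integrity check: foreign keys with no match in the target
    report["orphans"] = [r[key] for r in rows if r[fk_field] not in target_keys]
    return report

# Hypothetical customer rows with one duplicate, one blank name,
# one unknown status code, and one region missing from the target.
rows = [
    {"id": "C1", "name": "Acme", "status": "A", "region": "R1"},
    {"id": "C2", "name": "",     "status": "X", "region": "R9"},
    {"id": "C1", "name": "Acme", "status": "A", "region": "R1"},
]
report = profile(rows, key="id", required=["name"],
                 valid_codes={"A", "I"}, code_field="status",
                 target_keys={"R1", "R2"}, fk_field="region")
print(report)
```

Each finding in the report is exactly the kind of fact to put in front of the data stewards before mapping is finalized, and to re-run after every mock load.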
All of these practices require a good working relationship, and a commitment, from the entire team, business and IT, data stewards and developers. I see systems integrators and IT shops approaching this in various ways (and often looking to improve on the practice with each subsequent migration). Some are taking a more informal approach of having the data experts assigned and co-located with the technical team. Other organizations, especially those that are geographically dispersed, are creating very specific project and cutover plans that define, step by step, what the collaboration (or data stewardship) tasks are and who is responsible.
For example, a global manufacturing company has embarked on a modernization program in which they are decommissioning hundreds of legacy applications. For each application data migration they start with a template project plan, identify the data stewards and data experts, and gain commitment from them for mapping, data quality review, and validation. The team 1) can identify exactly what the data expert must commit to, and for how long, and 2) will not move forward on the project without that commitment.
These changes in approach are driving how we use software as well. For example, the business's need to ensure regulatory compliance via data governance processes can be supported with effective metadata management and self-documenting mappings. For the needed cleansing and validation of migrated data, the data migration team can provide business users with data quality dashboards and data profiling results (or access to data profiling tools). Creating and sharing these assets is becoming easier over time.
So what are you seeing out there for business/IT collaboration on application data migrations? Would love to hear from other practitioners.