Category Archives: Data Migration
The post is by Philip Howard, Research Director, Bloor Research.
One of the standard metrics used to support buying decisions for enterprise software is total cost of ownership. Typically, the other major metric is functionality. However, functionality is ephemeral. Not only does it evolve with every new release, but while particular features may be relevant to today’s project, there is no guarantee that those same features will be applicable to tomorrow’s needs. A broader metric than functionality is capability: how suitable is this product for a range of different project scenarios, and will it support both simple and complex environments?
Earlier this year Bloor Research published some research into the data integration market, which investigated exactly these issues: how often were tools reused, how many targets and sources were involved, and for what sort of projects were products deemed suitable? We then compared these findings with the total cost of ownership figures that we also captured in our survey. I will be discussing the results of our research live with Kristin Kokie, who is the interim CIO of Informatica, on Guy Fawkes’ Day (November 5th). I don’t promise anything explosive, but it should be interesting and I hope you can join us. The discussions will be vendor neutral (mostly: I expect that Kristin has a degree of bias).
To Register for the Webinar, click Here.
At Western Union, a multi-billion dollar global financial services and communications company, data is recognized as a core asset. Like many other financial services firms, Western Union thrives on data for both harvesting new business opportunities and managing its internal operations. And like many other enterprises, Western Union isn’t just ingesting data from relational data sources. They are mining a number of new information-rich sources like clickstream data and log data. With Western Union’s scale and speed demands, the data pipeline just has to work so they can optimize customer experience across multiple channels (e.g. retail, online, mobile, etc.) to grow the business.
Let’s level set on how important scale and speed are to Western Union. Western Union processes more than 29 financial transactions every second. Analytical performance simply can’t be the bottleneck for extracting insights from this blazing velocity of data. So to maximize the performance of their data warehouse appliance, Western Union offloaded data quality and data integration workloads onto a Cloudera Hadoop cluster. Using the Informatica Big Data Edition, Western Union capitalized on the performance and scalability of Hadoop while unleashing the productivity of their Informatica developers.
Informatica Big Data Edition enables data driven organizations to profile, parse, transform, and cleanse data on Hadoop with a simple visual development environment, prebuilt transformations, and reusable business rules. So instead of hand coding one-off scripts, developers can easily create mappings without worrying about the underlying execution platform. Raw data can be easily loaded into Hadoop using Informatica Data Replication and Informatica’s suite of PowerExchange connectors. After the data is prepared, it can be loaded into a data warehouse appliance for supporting high performance analysis. It’s a win-win solution for both data managers and data consumers. Using Hadoop and Informatica, the right workloads are processed by the right platforms so that the right people get the right data at the right time.
Using Informatica’s Big Data solutions, Western Union is transforming the economics of data delivery, enabling data consumers to create safer and more personalized experiences for Western Union’s customers. Learn how the Informatica Big Data Edition can help put Hadoop to work for you. And download a free trial to get started today!
A mid-sized insurer recently approached our team for help. They wanted to understand how they fell short in making their case to their executives. Specifically, they proposed that fixing their customer data was key to supporting the executive team’s highly aggressive 3-year growth plan. (This plan was 3x today’s revenue). Given this core organizational mission – aside from being a warm and fuzzy place to work supporting its local community – the slam dunk solution to help here is simple. Just reducing the data migration effort around the next acquisition or avoiding the ritual annual, one-off data clean-up project already pays for any tool set enhancing data acquisitions, integration and hygiene. Will it get you to 3x today’s revenue? It probably won’t. What will help are the following:
Hard cost avoidance via software maintenance or consulting elimination is the easy part of the exercise. That is why CFOs love it and focus so much on it. It is easy to grasp and immediate (aka next quarter).
Soft cost reductions, like staff redundancies, are a bit harder. Although they are viable, in my experience very few decision makers want to work on a business case to lay off staff. My team has had one so far. Instead, they look at these savings as freed-up capacity, which can be re-deployed more productively. Productivity is also a bit harder to quantify, as you typically have to understand how data travels and gets worked on between departments.
However, revenue effects are even harder and esoteric to many people as they include projections. They are often considered “soft” benefits, although they outweigh the other areas by 2-3 times in terms of impact. Ultimately, every organization runs their strategy based on projections (see the insurer in my first paragraph).
The hardest to quantify is risk. Not only is it based on projections – often from a third party (Moody’s, TransUnion, etc.) – but few people understand it. More often, clients don’t even accept you investigating this area if you don’t have an advanced degree in insurance math. Nevertheless, risk can generate extra “soft” cost avoidance (beefing up reserve account balance creating opportunity cost) but also revenue (realizing a risk premium previously ignored). Often risk profiles change due to relationships, which can be links to new “horizontal” information (transactional attributes) or vertical (hierarchical) from parent-child relationships of an entity and the parent’s or children’s transactions.
Given the above, my initial advice to the insurer would be to look at the heartache of their last acquisition, use a benchmark for IT productivity from improved data management capabilities (typically 20-26% – Yankee Group) and there you go. This is just the IT side so consider increasing the upper range by 1.4x (Harvard Business School) as every attribute change (last mobile view date) requires additional meetings on a manager, director and VP level. These people’s time gets increasingly more expensive. You could also use Aberdeen’s benchmark of 13hrs per average master data attribute fix instead.
You can also look at productivity areas, which are typically overly measured. Let’s assume a call center rep spends 20% of the average call time of 12 minutes (depending on the call type – account or bill inquiry, dispute, etc.) understanding
- Who the customer is
- What he bought online and in-store
- If he tried to resolve his issue on the website or store
- How he uses equipment
- What he cares about
- If he prefers call backs, SMS or email confirmations
- His response rate to offers
- His/her value to the company
Today he spends that 20% of every call stringing together insights from five applications and twelve screens. If he instead gets the same information in one frame within seconds, whichever application he touches, you have just freed up 20% worth of his hourly compensation.
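As a rough sketch of the arithmetic above, the following Python snippet turns the 20% look-up share of a 12-minute call into a cost per call and per year. The hourly compensation and the yearly call volume are hypothetical placeholders, not figures from this post, so substitute your own.

```python
def lookup_cost_per_call(avg_call_minutes, lookup_share, hourly_compensation):
    """Cost of the minutes a rep spends piecing together customer context on one call."""
    lookup_minutes = avg_call_minutes * lookup_share
    return lookup_minutes / 60 * hourly_compensation


def annual_lookup_cost(calls_per_year, avg_call_minutes, lookup_share, hourly_compensation):
    """Yearly cost of that look-up time for one rep."""
    per_call = lookup_cost_per_call(avg_call_minutes, lookup_share, hourly_compensation)
    return calls_per_year * per_call


if __name__ == "__main__":
    # From the text: 12-minute average call, 20% of it spent on look-ups.
    # Hypothetical: $30/hour compensation and 8,000 calls per rep per year.
    per_call = lookup_cost_per_call(12, 0.20, 30.0)
    per_year = annual_lookup_cost(8_000, 12, 0.20, 30.0)
    print(f"Look-up cost per call: ${per_call:.2f}")
    print(f"Look-up cost per rep per year: ${per_year:,.2f}")
```

Multiplying the yearly figure by the number of reps gives an upper bound on the capacity that a single customer view could free up; how much of it is actually recovered depends on how quickly the consolidated view loads.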
Then look at the software, hardware, maintenance and ongoing management of the likely customer record sources (pick the worst and best quality one based on your current understanding), which will end up in a centrally governed instance. Per DAMA, every duplicate record will cost you between $0.45 (party) and $0.85 (product) per transaction (edit touch). At the very least each record will be touched once a year (likely 3-5 times), so multiply your duplicated record count by that and you have your savings from just de-duplication. You can also use Aberdeen’s benchmark of 71 serious errors per 1,000 records, meaning the chance of transactional failure and required effort (% of one or more FTE’s daily workday) to fix is high. If this does not work for you, run a data profile with one of the many tools out there.
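The de-duplication estimate can be sketched in a few lines of Python. The per-touch costs ($0.45 for party records, $0.85 for product records) and the touch frequency (once a year at minimum, likely 3-5 times) come from the paragraph above; the duplicate count in the example is a hypothetical placeholder.

```python
# Per-touch cost of a duplicate record, as quoted in the text (per DAMA).
PARTY_COST_PER_TOUCH = 0.45
PRODUCT_COST_PER_TOUCH = 0.85


def dedup_savings(duplicate_records, cost_per_touch, touches_per_year=1):
    """Annual cost attributable to duplicates, i.e. the savings if they are removed."""
    return duplicate_records * cost_per_touch * touches_per_year


if __name__ == "__main__":
    duplicates = 50_000  # hypothetical count; take yours from a data profile
    low = dedup_savings(duplicates, PARTY_COST_PER_TOUCH, touches_per_year=1)
    high = dedup_savings(duplicates, PARTY_COST_PER_TOUCH, touches_per_year=5)
    print(f"Party-record savings range: ${low:,.0f} to ${high:,.0f} per year")
```

Running the low and high ends side by side keeps the estimate conservative: quote the one-touch figure and treat the rest as upside.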
If standardization of records (zip codes, billing codes, currency, etc.) is the problem, ask your business partner how many customer contacts (calls, mailing, emails, orders, invoices or account statements) fail outright and/or require validation because of these attributes. Once again, if you apply the productivity gains mentioned earlier, there are your savings. If you look at the number of orders whose payment or revenue recognition gets delayed by a week or a month, together with the average order amount, you can quantify how much profit (multiply by operating margin) you would be able to pull into the current financial year from the next one.
The same is true for speeding up the introduction of a new product or a change to it, which generates profits earlier. Note that the time value of funds realized earlier is too small to matter in most instances, especially in the current interest environment.
If emails bounce back or snail mail gets returned (no such address, no such name at this address, no such domain, no such user at this domain), (e)mail verification tools can help reduce the bounces. If every mail piece (forget email due to the minuscule cost) costs $1.25 – and this will vary by type of mailing (catalog, promotion post card, statement letter) – every piece sent to an incorrect or incomplete record is wasted cost. If you can, use fully loaded print cost incl. 3rd party data prep and returns handling. You will never capture all cost inputs but take a conservative stab.
If it was an offer, reduced bounces should also improve your response rate (also true for email now). Prospect mail response rates are typically around 1.2% (Direct Marketing Association), whereas phone response rates are around 8.2%. If you know that your current response rate is half that (for argument’s sake) and you send out 100,000 emails of which 1.3% (Silverpop) have customer data issues, then fixing 81-93% of them (our experience) will drop the bounce rate to under 0.3%, meaning more emails will arrive/be relevant. This in turn, multiplied by a standard conversion rate (MarketingSherpa) of 3% (industry and channel specific) and average order (your data), multiplied by operating margin, gets you a benefit value for revenue.
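The email example can be worked through as a minimal Python sketch. It assumes the benefit is calculated on the emails that newly arrive once their data issues are fixed, which is one reading of the paragraph above. The 100,000 sends, the 1.3% issue rate, the 81-93% fix rate and the 3% conversion rate come from the text; the average order value and operating margin are hypothetical placeholders.

```python
def remaining_issue_rate(issue_rate, fix_share):
    """Share of sent emails that still have data issues after the fix."""
    return issue_rate * (1 - fix_share)


def recovered_emails(sent, issue_rate, fix_share):
    """Number of emails that now arrive because their record was fixed."""
    return sent * issue_rate * fix_share


def revenue_benefit(sent, issue_rate, fix_share, conversion_rate, avg_order, operating_margin):
    """Profit attributed to the recovered emails."""
    recovered = recovered_emails(sent, issue_rate, fix_share)
    return recovered * conversion_rate * avg_order * operating_margin


if __name__ == "__main__":
    sent, issue_rate, conversion = 100_000, 0.013, 0.03
    avg_order, margin = 80.0, 0.10  # hypothetical: use your own data
    for fix_share in (0.81, 0.93):
        rate = remaining_issue_rate(issue_rate, fix_share)
        benefit = revenue_benefit(sent, issue_rate, fix_share, conversion, avg_order, margin)
        print(f"fix {fix_share:.0%}: remaining issue rate {rate:.3%}, benefit ${benefit:,.2f}")
```

At both ends of the fix range the remaining issue rate stays under the 0.3% mentioned above. The per-campaign benefit is small for a single send, so the figure only becomes meaningful once multiplied across all campaigns in a year.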
If product data and inventory carrying cost or supplier spend are your issue, find out how many supplier shipments you receive every month and the average cost of a part (or cost range), then apply the Aberdeen master data failure rate (71 in 1,000) to use cases around missing or incorrect supersession or alternate part data to assess the value of a single shipment’s overspend. You can also just use the ending inventory amount from the 10-K report and apply a 3-10% improvement (Aberdeen) in a top-down approach. Alternatively, apply 3.2-4.9% to your annual supplier spend (KPMG).
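The two top-down approaches reduce to applying a benchmark range to a base amount, sketched below in Python. The percentage ranges are the ones quoted above (3-10% of ending inventory, 3.2-4.9% of annual supplier spend); the inventory and spend amounts are hypothetical placeholders.

```python
def improvement_range(base_amount, low_rate, high_rate):
    """Return the (low, high) benefit obtained by applying a benchmark range to a base amount."""
    return base_amount * low_rate, base_amount * high_rate


if __name__ == "__main__":
    ending_inventory = 40_000_000        # hypothetical, e.g. taken from the 10-K
    annual_supplier_spend = 25_000_000   # hypothetical

    inv_low, inv_high = improvement_range(ending_inventory, 0.03, 0.10)
    spend_low, spend_high = improvement_range(annual_supplier_spend, 0.032, 0.049)

    print(f"Inventory improvement: ${inv_low:,.0f} to ${inv_high:,.0f}")
    print(f"Supplier spend improvement: ${spend_low:,.0f} to ${spend_high:,.0f}")
```

The two ranges estimate overlapping benefits from different angles, so it is safer to use one of them rather than add them together.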
You could also investigate the expediting or return cost of shipments in a period due to incorrectly aggregated customer forecasts, wrong or incomplete product information or wrong shipment instructions in a product or location profile. Apply Aberdeen’s 5% improvement rate and there you go.
Consider that a North American utility told us that just fixing their 200 Tier 1 suppliers’ product information achieved an increase in discounts from $14 to $120 million. They also found that fixing one basic attribute out of sixty in one part category saves them over $200,000 annually.
So what ROI percentages would you find tolerable or justifiable for, say an EDW project, a CRM project, a new claims system, etc.? What would the annual savings or new revenue be that you were comfortable with? What was the craziest improvement you have seen coming to fruition, which nobody expected?
Next time, I will add some more “use cases” to the list and look at some philosophical implications of averages.
Ah yes, the Old Mainframe. It just won’t go away. Which means there is still valuable data sitting in it. And that leads to a question that I have been asked repeatedly in the past few weeks: why should an organization use a tool like Informatica PowerExchange to extract data from a mainframe when you can also do it with a script that extracts the data as a flat file?
So below, thanks to Phil Line, Informatica’s Product Manager for Mainframe connectivity, are the top ten reasons to use PowerExchange over hand coding a flat file extraction.
1) Data will be “fresh” as of the time the data is needed – not already old based on when the extraction was run.
2) Any data extracted directly from files will be as the file held it, whereas any additional processes that need to run in order to extract/transfer data to LUW could potentially alter the original formats.
3) The consuming application can get the data when it needs it; there wouldn’t be any scheduling issues between creating the extract file and then being able to use it.
4) There is less work to do if PowerExchange reads the data directly from the mainframe: data type processing as well as potential code page issues are all handled by PowerExchange.
5) Unlike any files created with ftp type processes, where problems could cut short the expected data transfer, PowerExchange/PowerCenter provide log messages so as to ensure that all data has been processed.
6) The consumer can select only the data that is needed for the consuming application; filtering reduces the amount of data being transferred as well as any potential security exposure.
7) Any access to mainframe-based data can be secured according to the security tools in place on the mainframe; PowerExchange is fully compliant with the RACF, ACF2 and Top Secret security products.
8) Using Informatica’s PowerExchange, along with Informatica consuming tools (PowerCenter, Mercury etc.) provides a much simpler and cleaner architecture. The simpler the architecture the easier it is to find problems as well as audit the processes that are touching the data.
9) PowerExchange can generally help avoid the normal bottlenecks associated with getting data off the mainframe: programmers are not needed to create the extract processes, new schedules don’t need to be created to ensure that the extracts run, and when changes are necessary they can be controlled by the business group consuming the data.
10) Helps control mainframe data extraction processes that are still being run even though no one uses the generated data, because the original system that requested the data has become obsolete.
This creative thinking to solve a problem came from a request by the Swiss Army to build a soldier’s knife. In the end, the solution was all about getting the right tool for the right job in the right place. In many cases soldiers didn’t need industrial strength tools; all they really needed was a compact and lightweight tool to get the job at hand done quickly.
Putting this into perspective with today’s world of Data Integration, using enterprise-class data integration tools for the smaller data integration project is overkill and typically out of reach for the smaller organization. However, these smaller data integration projects are just as important as those larger enterprise projects, and they are often the innovation behind a new way of business thinking. The traditional hand-coding approach to addressing the smaller data integration project is not scalable, not repeatable and prone to human error. What’s needed is a compact, flexible and powerful off-the-shelf tool.
Thankfully, over a century after the world embraced the Swiss Army Knife, someone at Informatica was paying attention to revolutionary ideas. If you’ve not yet heard the news about the Informatica platform, a version called PowerCenter Express has been released and it is free of charge. You can use it to handle an assortment of what I’d characterize as high complexity / low volume data integration challenges and experience a subset of the Informatica platform for yourself. I’d emphasize that PowerCenter Express doesn’t replace the need for Informatica’s enterprise grade products, but it is ideal for rapid prototyping, profiling data, and developing quick proofs of concept.
PowerCenter Express provides a glimpse of the evolving Informatica platform by integrating four Informatica products into a single, compact tool. There are no database dependencies and the product installs in just under 10 minutes. Much to my own surprise, I use PowerCenter Express quite often going about the various aspects of my job with Informatica. I have it installed on my laptop so it travels with me wherever I go. It starts up quickly so it’s ideal for getting a little work done on an airplane.
For example, recently I wanted to explore building some rules for an upcoming proof of concept on a plane ride home so I could claw back some personal time for my weekend. I used PowerCenter Express to profile some data and create a mapping. And this mapping wasn’t something I needed to throw away and recreate in an enterprise version after my flight landed. Vibe, Informatica’s build once / run anywhere metadata driven architecture, allows me to export a mapping I create in PowerCenter Express to one of the enterprise versions of Informatica’s products such as PowerCenter, DataQuality or Informatica Cloud.
Because it is a free offering, I honestly didn’t expect too much from PowerCenter Express when I first started exploring it. However, due to my own positive experiences, I now like to think of PowerCenter Express as the Swiss Army Knife of Data Integration.
To start claiming back some of your personal time, get started with the free version of PowerCenter Express, found on the Informatica Marketplace at: https://community.informatica.com/solutions/pcexpress
- Does Data Integration technology truly provide a clear path toward unified data?
- Can businesses truly harness the potential of their information?
- Can companies take powerful action as a result?
Recently, Bloor Research set out to evaluate how things were actually playing out on the ground. In particular, they wanted to determine which data integration projects were actually taking place, at what scale, and with what results. The study, “Comparative Costs and Uses for Data Integration Platforms,” was authored by Philip Howard, research director at Bloor. The study examined data integration tool suitability across a range of scenarios, including:
- Data migration and consolidation projects
- Master data management (MDM) and associated solutions
- Application-to-application integration
- Data warehousing and business intelligence implementations
- Synching data with SaaS applications
- B2B data exchange
To draw conclusions, Bloor examined 292 responses from a range of companies. The responders used a variety of data integration approaches, from commercial data integration tools to “hand-coding.”
Informatica is pleased to be able to offer you a copy of this research for your review. The research covers areas like:
- Total Cost of Ownership (TCO)
We welcome you to download a copy of “Comparative Costs and Uses for Data Integration Platforms” today. We hope these findings offer you insights as you implement and evaluate your data integration projects and options.
Are you interested in Oracle Data Migration Best Practices? Are you upgrading, consolidating or migrating to or from an Oracle application? Moving to the cloud or a hosted service? Research and experience confirm that the tasks associated with migrating application data during these initiatives have the biggest impact on whether the project is considered a failure or success. So how do your peers ensure data migration success?
Informatica will be offering a full day Oracle Migrations Best Practices workshop at Oracle Application User Group’s annual conference, Collaborate 14, this year on April 7th in Las Vegas, NV. During this workshop, peers and experts will share best practices for how to avoid the pitfalls and ensure successful projects, lowering migration cost and risk. Our packed agenda includes:
- Free use and trials of data migration tools and software
- Full training sessions on how to integrate cloud-based applications
- How to provision test data using different data masking techniques
- How to ensure consistent application performance during and after a migration
- A review of Oracle Migration Best Practices and case studies
Case Study: EMC
One of the key case studies that will be highlighted is EMC’s Oracle migration journey. EMC Corporation migrated to Oracle E-Business Suite, acquired more than 40 companies in 4 years, consolidated and retired environments, and is now on its path to migrating to SAP. Not only did they migrate applications, but they also migrated their entire technology platform from physical to virtual on their journey to the cloud. They needed to control the impact of data growth along the way and manage the size of their test environments, while reducing the risk of exposing sensitive data to unauthorized users during development cycles. With best practices, and the help from Informatica, they estimate that they have saved approximately $45M in IT costs throughout their migrations. Now that they are deploying a new analytics platform based on Hadoop, they are leveraging existing skill sets and Informatica tools to ensure data is loaded into Hadoop without missing a beat.
Case Study: Verizon
Verizon is the second case study we will be discussing. They recently migrated to Salesforce.com and needed to ensure that more than 100 data objects were integrated with on-premises, back end applications. In addition, they needed to ensure that data was synchronized and kept secure in non-production environments in the cloud. They were able to leverage a cloud-based integration solution from Informatica to simplify their complex IT application architecture and maintain data availability and security – all while migrating a major business application to the cloud.
Case Study: OEM Heavy Equipment Manufacturer
The third case study we will review involves a well-known heavy equipment manufacturer who was facing a couple of challenges. The first was a need to separate data in an Oracle E-Business Suite application as a result of a divestiture. The second was a need to control the impact of data growth on their production application environments, which were going through various upgrades. Using an innovative approach based on Smart Partitioning, this enterprise estimates it will save $23M over a 5 year period while achieving 40% performance improvements across the board.
To learn more about what Informatica will be sharing at Collaborate 14, watch this video. If you are planning to attend Collaborate 14 this year and you are interested in joining us, you can register for the Oracle Migrations Best Practices Workshop here.
Data migration projects are notorious for going over budget and over time. These large projects typically cost around $875,000 and an average of 30% of that is due to project overruns. In today’s fast-paced, big data era, organizations cannot afford these missteps. Unfortunately, many companies treat major data projects as one-off events. This approach leads to product launch delays, produces no re-usable assets or best practices, and presents an outsized risk to business objectives.
I look forward to sharing how the successful organizations we work with have combated these issues using Master Data Management (MDM) as a platform for systems consolidation, migration, and upgrade projects. MDM accomplishes the following:
- Creates authoritative, trustworthy data
- Simplifies migration architecture using a hub-and-spoke model
- Maintains data consistency across new and old systems post-migration
- Enables reuse of data, mappings, and rules for the next migration project
In summary, MDM allows organizations to minimize risk and increase the speed of data migration.
To address this topic, I will be hosting a webinar titled “MDM as Platform for Systems Consolidation, Migration and Upgrade” on March 19th at 2:00 PM Eastern. In this webinar, you will learn about:
- Challenges faced in systems consolidation, migration and upgrades
- Solutions MDM brings to address these challenges in pre-migration, during-migration, and post-migration phases
- Examples of companies using MDM to manage data migration as a repeatable process
- Tips for expanding the use of MDM beyond data migration for operational and analytical purposes
Join me to learn how MDM works in practice and to gain understanding of how it can help make your next systems consolidation, migration, or upgrade the most efficient and effective yet. Sign up today for the webinar on Wednesday, March 19, 2014.
Informatica recently hosted a webinar with Cognizant, who shared how they streamline test data management processes internally with Informatica Test Data Management and pass on the benefits to their customers. Proclaimed as the world’s largest Quality Engineering and Assurance (QE&A) service provider, they have over 400 customers and thousands of testers and are considered a thought leader in the testing practice.
We polled over 100 attendees on what their top challenges were with test data management considering the data and system complexities and the need to protect their client’s sensitive data. Here are the results from that poll:
It was not surprising to see that generating test data sets and securing sensitive data in non-production environments were tied as the top two biggest challenges. Data integrity/synchronization was a very close third.
Cognizant with Informatica has been evolving its test data management offering to truly focus on not only securing sensitive data – but also improving testing efficiencies with identifying, provisioning and resetting test data – tasks that consume as much as 40% of testing cycle times. As part of the next generation test data management platform, key components of that solution include:
Sensitive Data Discovery – an integrated and automated process that searches data sets looking for exposed sensitive data. Many times, sensitive data resides in test copies unbeknownst to auditors. Once data has been located, data can be masked in non-production copies.
Persistent Data Masking – masks sensitive data in-flight while cloning data from production or in-place on a gold copy. Data formats are preserved while original values are completely protected.
Data Privacy Compliance Validation – because auditors want to know that data has in fact been protected, the ability to validate and report on data privacy compliance becomes critical.
Test Data Management – in addition to creating test data subsets, clients require the ability to synthetically generate test data sets to eliminate defects by having data sets aligned to optimize each test case. Also, in many cases, multiple testers work on the same environment and may clobber each other’s test data sets. Having the ability to reset test data becomes a key requirement to improve efficiencies.
Figure 2 Next Generation Test Data Management
When asked what tools or services had been deployed, 78% said in-house developed scripts/utilities. This is an incredibly time-consuming approach and one that has limited repeatability. Data masking had been deployed by almost half of the respondents.
Informatica with Cognizant are leading the way to establishing a new standard for Test Data Management by incorporating test data generation, data masking, and the ability to refresh or reset test data sets. For more information, check out Cognizant’s offering based on Informatica: TDMaxim and White Paper: Transforming Test Data Management for Increased Business Value.
A study by Bloor Research put the failure rate for data migration projects at 38%. When you consider that a failed data migration project can temporarily hold up vital business processes, this becomes even worse news. This affects customer service, internal business processes, productivity, etc., leading to an IT infrastructure that is just not meeting the expectations of the business.
If you own one of these dysfunctional IT infrastructures, you’re not alone. Most enterprises struggle with the ability to manage the use of data within the business. Data integration becomes an ad hoc concept that is solved when needed using whatever works at the time. Moreover, the ability to manage migration and data quality becomes a lost art, and many users distrust the information coming from business systems they should rely upon.
The solution to this problem is complex. There needs to be a systemic approach to data integration that is led by key stakeholders. Several business objectives should be set prior to creating a strategy and approach and purchasing key technologies. These include:
- Define the cost of risk in having substandard data quality.
- Define the cost of risk in not having data available to systems and humans in the business.
- Define the cost of lost strategic opportunities, such as moving into a new product line or acquiring a company.
The idea is that, by leveraging data integration approaches and technology, we’ll reduce much of the risk, which actually has a cost.
The risk of poor data quality is obvious to those inside and outside of IT, but the damage that could occur when you do not have a good data integration and data quality strategy and supporting technology is perhaps much farther reaching than many think. The trick is to solve both problems at the same time, leveraging data integration technology that can deal with data quality issues as well.
Not having data available to both end users who need to see it to operate the business, as well as to machines that need to respond to changing data, adds to the risk and thus the cost. In many enterprises, there is a culture of what I call “data starvation.” This means it’s just accepted that you can’t track orders with accurate data, you can’t pull up current customer sales information, and this is just the way things are. This is really an easy fix these days, and one dollar invested in creating a strategy or purchasing and implementing technology will come back to the business twentyfold, at least.
Finally, define the cost of lost strategic opportunities. This is a risk that many companies pay for, but it’s complex and difficult to define. This means that the inability to get the systems communicating and sharing data around a merger, for example, means that the enterprises can’t easily take advantage of market opportunities.
I don’t know how many times I’ve heard of enterprises failing at their attempts to merge two businesses because IT could not figure out how to make the systems work and play well together. As with the other two risks, a manageable investment of time and money will remove this risk and thus the cost of the risk.