Working for Informatica has many advantages. One of them is that I clearly understand the difference between Product Information Management (PIM) and Master Data Management (MDM) for product data[i]. Since I have this clear in my own mind, it is easy to forget that this may not be as obvious to others. As frequently happens, it takes a customer to help us articulate why PIM is not the same as Product MDM. Now that this is fresh in my mind again, I thought I would share why the two are different, and when you should consider each one, or both.
In a lengthy discussion with our customer, many points were raised, discussed and classified. In the end, all arguments essentially came down to each technology’s primary purpose. A different primary purpose means that typical capabilities of the two products are geared towards different audiences and use cases.
PIM is a business application that centralizes and streamlines the creation and enhancement of consistent, but localised product content across channels. (Figure 1)
Figure 1: PIM Product Data Creation Flow
Product MDM is an infrastructure component that consolidates the core global product data that should be consistent across multiple and diverse systems and business processes, but typically isn’t. (Figure 2)
Figure 2: MDM Product Data Consolidation Hub
The choice between the two technologies really comes down to the current challenge you are trying to solve. If you cannot get clean and consistent data out through all your sales channels fast enough, then a PIM solution is the correct choice for you. However, if your organisation is making poor decisions and seeing bloated costs (e.g. procurement or inventory costs) due to poor internal product data, then MDM technology is the right choice.
But, if it is so simple – why am I even writing this down? Why are the lines blurring now?
Here is my 3-part theory:
- A focus on good quality product data is a relatively recent trend. Different industries started by addressing different challenges.
- PIM has primarily been used in retail B2C environments and distributor B2B or B2C environments. That is, organisations which are primarily focused around the sale of a product, rather than the design and production of the product.
- Product MDM has been used predominately by manufacturers of goods, looking to standardise and support global processes, reporting and analytics across departments.
- Now, manufacturers are increasingly looking to take control of their product information outside their organisation.
- This trend is most notable in Consumer Goods (CG) companies.
- Increasingly consistent, appealing and high quality data in the consumer realm is making the difference between choosing your product vs. a competitor’s.
- CG companies must ensure all channels – their own and their retail partners’ – are fed with high quality product data.
- So PIM is now entering organisations which should already have a Product MDM tool. If they don’t, confusion arises.
- When Marketing buys PIM (and it normally is Marketing), quite frankly this shows up the poor product data management upstream of marketing.
- It becomes quite tempting to try to jam as much product data into a PIM system as possible, going beyond the original scope of PIM.
The follow-on question is clear: why can’t we just make a few changes and use PIM as our MDM technology, or MDM as our PIM solution? It is very tempting. Both data models can be extended to add extra fields. In Informatica’s case, both are supported by a common, feature-rich workflow tool. However, there are inherent risks in using PIM where Product MDM is needed or Product MDM where PIM is needed.
After discussions with our customer, we identified 3 risks of modifying PIM when it is really Product MDM functionality that is needed:
- Decrease speed of PIM deployment
- Reduce marketing agility
- Risk of marketing abandoning the hybrid tool in the mid-term
The last turned out to be the least understood, but that doesn’t make it any less real. Since each of these risks deserves more explanation, I will discuss them in Part 2 of this Blog. (Still to be published)
In summary, PIM and Product MDM are designed to play different roles in the quest for the availability of high quality product data both internally and externally. There are risks and costs associated with modifying one to take on the role of the other. In many cases there is a place for both PIM and MDM, but you will still need to choose a starting point. Each journey to high quality product data will be different, but the goal is still the same – to turn product data into business value.
I (or one of my colleagues in a city near you) will be happy to help you understand what the best starting point is for your organisation.
[i] In case you were wondering, this is not the benefit that I joined Informatica for.
“Victory won’t go to those with the most data. It will go to those who make the best use of data.” – Doug Henschen, Information Week, May 2014
But how do you actually make best use of your data and become one of the data success stories? If you are going to differentiate on data, you need to use your data to innovate. Common options include:
- New products & services which leverage a rich data set
- Different ways to sell & market existing products and services based on detailed knowledge
But there is no ‘app for that’. Think about it – if you can buy an application, you are already too late. Somebody else has identified a need and created a product they expect to sell repeatedly. Applications cannot provide you a competitive advantage if everyone has one. Most people agree they will not rise to the top because they have installed ERP, CRM, SRM, etc. So it will be with any applications which claim to win you market share and profits based on data. If you want to differentiate, you need to stay ahead of the application curve, and let your internal innovation drive you forward.
Simplistically this is a 4 step process:
- Assemble a team of innovative employees and match them with skilled data scientists
- Identify data-based differentiation opportunities
- Feed the team high quality data at the rate at which they need it
- Provide them tools for data analysis and integrating data into business processes as required
Leaving aside the simplicity of these steps for a process – there is one key change to a ‘normal’ IT project. Normally data provisioning is an afterthought during IT projects. Now it must take priority. Frequently data integration is poorly executed, and barely documented. Data quality is rarely considered during projects. Poor data provisioning is a direct cause of spaghetti charts which contribute to organisational inflexibility and poor data availability to the business. Does “It will take 6 months to make those changes” sound familiar?
We have been told Big Data will change our world; Data is a raw material; Data is the new oil.
The business world is changing. We are moving into a world where our data is one of our most valuable resources, especially when coupled with our internal innovation. Applications used to differentiate us; now they are becoming commodities to be replaced and upgraded, or new ones acquired as rapidly as our business changes.
I believe that in order to differentiate on data, an organisation needs to treat data as the valuable resource we all say it is. Data Agility, Management and Governance are the true differentiators of our era. This is a frustration for those trying to innovate, but locked in an inflexible data world, built at a time when people still expected ERP to be the answer to everything.
To paraphrase a recent complaint I heard: “My applications should be like my phone. I buy a new one, turn it on and it already has all my data”.
This is the exact vision that is driving Informatica’s Intelligent Data Platform.
In the end, differentiating on data comes down to one key necessity: High quality data MUST be available to all who need it, when they need it.
Total Quality Management, as it relates to products and services, has its roots in the 1920s. The 1960s provided a huge boost with the rise of gurus such as Deming, Juran and Crosby. Whilst each had their own contribution, common principles for TQM that emerged in this era remain in practice today:
- Management (C-level management) is ultimately responsible for quality
- Poor quality has a cost
- The earlier in the process you address quality, the lower the cost of correcting it
- Quality should be designed into the system
So for 70 years industry in general has understood the cost of poor quality, and how to avoid these costs. So why is it that in 2014 I was party to a conversation that included the statement:
“I only came to the conference to see if you (Informatica) have solved the data quality problem.”
Ironically the TQM movement was only possible based on the analysis of data, but this is the one aspect that is widely ignored during TQM implementation. So much for ‘Total’ Quality Management.
This person is not alone in their thoughts. Many are waiting for the IT knight in shining armour, the latest and greatest data quality tools secured on their majestic steed, to ride in and save the day. Data quality dragon slain, cold drinks all round, job done. This will not happen. Put data quality in the context of total quality management principles to see why: A single department cannot deliver data quality alone, regardless of the strength of their armoury.
I am not sure anyone would demand a guarantee of a high quality product from their machinery manufacturers. Communications providers cannot deliver high quality customer service organisations through technology alone. These suppliers will have an influence on final product quality, but everyone understands equipment cannot deliver in isolation. Good quality raw materials, staff that genuinely take pride in their work and the correct incentives are key to producing high quality products and services.
So why is there an expectation that data quality can be solved by tools alone?
At a minimum senior management support is required to push other departments to change their behaviour and/or values. So why aren’t senior management convinced that data quality is a problem worth their attention the way product & service quality is?
The fact that poor data quality has a high cost is reasonably well known via anecdotes. However, cost has not been well quantified, and hence fails to grab the attention of senior management. A 2005 paper by Richard Marsh[i] states: “Research and reports by industry experts, including Gartner Group, PriceWaterhouseCoopers and The Data Warehousing Institute clearly identify a crisis in data quality management and a reluctance among senior decision makers to do enough about it.” Little has changed since 2005.
However, we are living in a world where data generation, processing and consumption are increasing exponentially. With all the hype and investment in data, we face the grim prospect of fully embracing an age of data-driven-everything founded on a very poor quality raw material. Data quality is expected to be applied after generation, during the analytic phase. How much will that cost us? In order to function effectively, our new data-driven world must have high quality data running through every system and activity in an organization.
The Total Data Quality Movement is long overdue.
Only when every person in every organization understands the value of the data, do we have a chance of collectively solving the problem of poor data quality. Data quality must be considered from data generation, through transactional processing and analysis right until the point of archiving.
Informatica DQ supports IT departments in automating data correction where possible, and highlighting poor data for further attention where automation is not possible. MDM plays an important role in sustaining high quality data. Informatica tools empower the business to share the responsibility for total data quality.
We are ready for Total Data Quality, but continue to await the Total Data Quality Movement to get off the ground.
(If you do not have time to wait for TDQM to gain traction, we can help you measure the cost of poor quality data in your organization to win corporate buy-in now.)
But it’s not as easy as a couple of queries. The reality is that the body of knowledge in question is seldom in a shape recognizable as a ‘body’. In most corporations, the data regulators are asking for is distributed throughout the organization. Perhaps a ‘Scattering of Knowledge’ is a more appropriate metaphor.
It is time to accept that data distribution is here to stay. The idea of a single ERP has long gone. Hype around Big Data is dying down, and being replaced by a focus on all data as a valuable asset. IT architectures are becoming more complex as additional data storage and data fueled applications are introduced. In fact, the rise of Data Governance’s profile within large organizations is testament to the acceptance of data distribution, and the need to manage it. Forrester has just released their first Forrester Wave ™ on data governance. They state it is time to address governance as “Data-driven opportunities for competitive advantage abound. As a consequence, the importance of data governance — and the need for tooling to facilitate data governance —is rising.” (Informatica is recognized as a Leader)
However, Data Governance Programs are not yet as widespread as they should be. Unfortunately it is hard to directly link strong Data Governance to business value. This means trouble getting a senior exec to sponsor the investment and cultural change required for strong governance. Which brings me back to the opportunity within Regulatory Compliance. My thinking goes like this:
- Regulatory compliance is often about gathering and submitting high quality data
- This is hard as the data is distributed, and the quality may be questionable
- Tools are required to gather, cleanse, manage and submit data for compliance
- There is a high overlap of tools & processes for Data Governance and Regulatory Compliance
So – why not use Regulatory Compliance as an opportunity to pilot Data Governance tools, process and practice?
Far too often compliance is a once-off effort with a specific tool. This tool collects data from disparate sources, with unknown data quality. The underlying data processes are not addressed. Strong Governance will have a positive effect on compliance – continually increasing data access and quality, and hence reducing the cost and effort of compliance. Since the cost of non-compliance is often measured in millions, getting exec sponsorship for a compliance-based pilot may be easier than for a broader Data Governance project. Once implemented, lessons learned and benefits realized can be leveraged to expand Data Governance into other areas.
Previously I likened Regulatory Compliance to a Buy One, Get One Free opportunity: Compliance + a free performance boost. If you use your compliance budget to pilot Data Governance – the boost will be larger than simply implementing Data Quality and MDM tools. The business case shouldn’t be too hard to build. Consider that EY’s research shows that companies that successfully use data are already outperforming their peers by as much as 20%.[i]
Data Governance Benefit = (Cost of non-compliance + 20% performance boost) – compliance budget
Yes, the equation can be considered simplistic. But it is compelling.
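As a rough illustration, the equation can be worked through in a few lines of Python. All figures below are hypothetical placeholders, and the sketch assumes the 20% boost is applied to a baseline performance figure such as annual profit, which the equation itself does not specify.

```python
def data_governance_benefit(cost_of_non_compliance, baseline_performance,
                            compliance_budget, boost_rate=0.20):
    """Benefit = (cost of non-compliance + performance boost) - compliance budget.

    Assumption: the boost is boost_rate times a baseline performance figure
    (for example annual profit). The 20% default is the "as much as" figure
    quoted from EY's research, not a guaranteed outcome.
    """
    performance_boost = boost_rate * baseline_performance
    return (cost_of_non_compliance + performance_boost) - compliance_budget


# Hypothetical figures, for illustration only.
benefit = data_governance_benefit(
    cost_of_non_compliance=5_000_000,
    baseline_performance=10_000_000,
    compliance_budget=1_000_000,
)
print(f"Estimated benefit: {benefit:,.0f}")
```

Lowering `boost_rate` shows how sensitive the business case is to the assumed performance gain, which is probably the first question a sponsor will ask.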
Regardless of the industry, new regulatory compliance requirements are more often than not treated like the introduction of a new tax. A few may be supportive, some will see the benefits, but most will focus on the negatives – the cost, the effort, the intrusion into private matters. There will more than likely be a lot of grumbling.
Across many industries there is currently a lot of grumbling, as new regulation seems to be springing up all over the place. Pharmaceutical companies have to deal with IDMP in Europe and UDI in the USA. This is hot on the heels of the US Sunshine Act, which is being followed in Europe by Aggregate Spend requirements. Consumer Goods companies in Europe are looking at the consequences of beefed up 1169 requirements. Financial Institutes are mulling over compliance with BCBS-239. Behind the grumbling most organisations across all verticals appear to have a similar approach to regulatory compliance. The pattern seems to go like this:
- Delay (The requirements may change)
- Scramble (They want it when? Why didn’t we get more time?)
- Code to Spec (Provide exactly what they want, and only what they want)
No wonder these requirements are seen as purely a cost and an annoyance. But it doesn’t have to be that way, and in fact, it should not. Just like I have seen a pattern in response to compliance, I see a pattern in the requirements themselves:
- The regulators want data
- Their requirements will change
- When they do change, regulators will be wanting even more data!
Now read the last 3 bullet points again, but use ‘executives’ or ‘management’ or ‘the business people’ instead of ‘regulators’. The pattern still holds true. The irony is that execs will quickly sign off on budget to meet regulatory requirements, but find it hard to see the value in “infrastructure” projects. Projects that will deliver this same data to their internal teams.
This is where the opportunity comes in. PwC’s 2013 State of Compliance Report[i] shows that over 42% of central compliance budgets are in excess of $1m. A significant figure. Efforts outside of the compliance team imply a higher actual cost. Large budgets are not surprising in multi-national companies, who often have to satisfy multiple regulators in a number of countries. As an alternative to multiple overlapping compliance projects, what if this significant budget were repurposed to create a flexible data management platform? This approach could deliver compliance, but provide even more value internally.
Almost all internal teams are currently clamouring for additional data to drive their newest application. Pharma and CG sales & marketing teams would love ready access to detailed product information. So would consumer and patient support staff, as well as down-stream partners. Trading desks and client managers within Financial Institutes should really have real-time access to their risk profiles guiding daily decision making. These data needs will not be going away. Why should regulators be prioritised over the people who drive your bottom line and who are guardians of your brand?
A flexible data management platform will serve everyone equally. Foundational tools for a flexible data management platform exist today including Data Quality, MDM, PIM and VIBE, Informatica’s Virtual Data Machine. Each of them plays a significant role in easing regulatory compliance, and as a bonus they deliver measurable business value in their own right. Implemented correctly, you will get enhanced data agility & visibility across the entire organisation as part of your compliance efforts. Sounds like ‘Buy one Get One Free’, or BOGOF in retail terms.
Unlike taxes, BOGOF opportunities are normally embraced with open arms. Regulatory compliance should receive a similar welcome – an opportunity to build the foundations for universal delivery of data which is safe, clean and connected. A 2011 study by The Economist found that effective regulatory compliance benefits businesses across a wide range of performance metrics[ii].
Is it time to get your free performance boost?
In almost all cases, poor quality master data is not a laughing matter. It can directly lead to losing out on millions either through excess costs or lost opportunity. However, there are occasions where poor data has a lighter side. One such case happened in the mid ‘90s whilst I was implementing ERP for a living.
On the first phase of a major project, I was on the master data team – in particular developing processes that enabled master data creation and maintenance. Since both MDM and workflow tools were in their infancy, we conjured up a manual process supported by green screen based scripts and email. Not great, but workable. However, before go-live, system test was proving painful, with every other team wanting lots of material numbers for testing. Since manual entry excited no-one, we created a product master file, populated by cut-and-paste, which the IT guys happily automated for us to create hundreds of material masters.
Testing was a success, but then something went wrong. Actually two things went wrong – the first being the script was designed to load test data and had no data quality reviews, but somehow it was used to load production data. This brought on the second problem which our master data team found highly amusing.
A couple of days after go-live, I got a call from ‘Phil’ – the shift supervisor in shipping.
After the usual pleasantries, it was clear that something was really bothering Phil, and he had been grappling with a problem for 45min or so. It came down to this:
“No matter how hard we try, we cannot fit 32 desktops on a pallet, even if we shrink wrap it”
I was a bit confused – pallets were sized for 32 laptops, or 16 desktops. Why would Phil be trying to put 32 desktops on a single pallet? Some brief queries showed there was an error in the master data. Since system test did not actually involve any physical actions in the real world, nobody noticed that our master data was inaccurate, and all products were defined as 32 per pallet. This was the data the script had loaded into the new ERP.
I suggested Phil continue as usual (16 per pallet), regardless of what the system said in terms of items per pallet, and I would raise a support request to get the data fixed.
I still try to imagine how many warehouse employees it takes to hold 32 desktops on a pallet designed for 16, whilst another armed with a portable shrink-wrapper desperately tries to wrap desktops, but not hands or whole people onto the pallet. I imagine all would be cursing ‘the new system’ loudly during the process.
There are a few important lessons out of this, some of which are:
- Without the correct tools and care, your data will quickly be infested with inaccuracies
- New systems are not immune to poor data quality (and may be at greater risk)
- Appointed data custodians should care about the data and have a stake in its accuracy
And perhaps most interestingly:
People will believe the system, even if their instinct tells them otherwise.
Which is probably one of the best reasons to ensure your data is correct. This was a clear demonstration of how poor data quality can directly & negatively affect daily business processes.
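The first lesson can be made concrete. Below is a minimal sketch in Python of the kind of check the load script lacked. It uses the pallet sizes from the story (32 laptops or 16 desktops); the record layout and field names are hypothetical and do not reflect the original ERP.

```python
# Expected pallet capacity by product type, taken from the story above.
# The field names and record layout are hypothetical.
EXPECTED_PER_PALLET = {"laptop": 32, "desktop": 16}


def find_pallet_errors(records):
    """Return (material_number, problem) pairs for records that fail the check."""
    errors = []
    for record in records:
        expected = EXPECTED_PER_PALLET.get(record["product_type"])
        if expected is None:
            errors.append((record["material_number"], "unknown product type"))
        elif record["items_per_pallet"] != expected:
            errors.append((
                record["material_number"],
                f"items_per_pallet is {record['items_per_pallet']}, expected {expected}",
            ))
    return errors


# Test data resembling the cut-and-paste file: everything set to 32 per pallet.
product_master = [
    {"material_number": "M-001", "product_type": "laptop", "items_per_pallet": 32},
    {"material_number": "M-002", "product_type": "desktop", "items_per_pallet": 32},
]

for material_number, problem in find_pallet_errors(product_master):
    print(f"{material_number}: {problem}")
```

Run before the load, a check along these lines would have flagged the desktop record rather than leaving the error to be discovered in the warehouse.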
A real pity that this incident predated smart phones as well as Master Data Management and workflow tools. If it happened today, I may have had a great photo to remind me of the importance of accurate master data.
For the past few years, the press has been buzzing about the potential value of Big Data. However, there is little coverage focusing on the data itself – how do you get it, is it accurate, and who can be trusted with it?
We are the source of data that is often spoken about – our children, friends and relatives and especially those people we know on Facebook or LinkedIn. Over 40% of Big Data projects are in the sales and marketing arena – relying on personal data as a driving force. While machines have no choice but to provide data when requested, people do have a choice. We can choose not to provide data, or to purposely obscure our data, or to make it up entirely.
So, how can you ensure that your organization is receiving real information? Active participation is needed to ensure a constant flow of accurate data to feed your data-hungry algorithms and processes. While click-stream analysis does not require individual identification, follow-up sales & marketing campaigns will have limited value if the public at large is using false names and pretend information.
BCG has identified a link between trust and data sharing:
“We estimate that those that manage this issue well [creating trust] should be able to increase the amount of consumer data they can access by at least five to ten times in most countries.”[i]
With that in mind, how do you create the trust that will entice people to share data? The principles behind common data privacy laws provide guidelines. These include: accountability, purpose identification and disclosure, collection with knowledge and consent, data accuracy, individual access and correction, as well as the right to be forgotten.
But there are challenges in personal data stewardship – in part because the current world of Big Data analysis is far from stable. In the ongoing search for the value of Big Data, new technologies, tools and approaches are being piloted. Experimentation is still required which means moving data around between data storage technologies and analytical tools, and giving unprecedented access to data in terms of quantity, detail and variety to ever growing teams of analysts. This experimentation should not be discouraged, but it must not degrade the accuracy or security of your customers’ personal data.
How do you measure up? If I made contact and asked for the sum total of what you knew about me, and how my data was being used – how long would it take to provide this information? Would I be able to correct my information? How many of your analysts can view my personal data and how many copies have you distributed in your IT landscape? Are these copies even accurate?
Through our data quality, data mastering and data masking tools, Informatica can deliver a coordinated approach to managing your customers’ personal data and build trust by ensuring the safety and accuracy of that data. With Informatica managing your customers’ data, your internal team can focus their attention on analytics. Analytics from accurate data can help develop the customer loyalty and engagement that is vital to both the future security of your business and continued collection of accurate data to feed your Big Data analysis.
[i] The Trust Advantage: How to Win with Big Data; bcg.perspectives November 2013
“If I had my way, I’d fire the statisticians – all of them – they don’t add value”.
Surely not? Why would you fire the very people who were employed to make sense of the vast volumes of manufacturing data and guide future production? But he was right. The problem was that, at that time, data management was so poor that data was simply not available for the statisticians to analyze.
So, perhaps this title should be re-written to be:
Fire your Data Scientists – They Aren’t Able to Add Value.
Although this statement is a bit extreme, the same situation may still exist. Data scientists frequently share frustrations such as:
- “I’m told our data is 60% accurate, which means I can’t trust any of it.”
- “We achieved our goal of an answer within a week by working 24 hours a day.”
- “Each quarter we manually prepare 300 slides to anticipate all questions the CFO may ask.”
- “Fred manually audits 10% of the invoices. When he is on holiday, we just don’t do the audit.”
This is why I think the original quote is so insightful. Value from data is not automatically delivered by hiring a statistician, analyst or data scientist. Even with the latest data mining technology, one person cannot positively influence a business without the proper data to support them.
Most organizations are unfamiliar with the structure required to deliver value from their data. New storage technologies will be introduced and a variety of analytics tools will be tried and tested. This change is crucial to success. In order for statisticians to add value to a company, they must have access to high quality data that is easily sourced and integrated. That data must be available through the latest analytics technology. This new ecosystem should provide insights that can play a role in future production. Staff will need to be trained, as this new data will be incorporated into daily decision making.
With a rich 20-year history, Informatica understands data ecosystems. Employees become wasted investments when they do not have access to the trusted data they need in order to deliver their true value.
Who wants to spend their time recreating data sets to find a nugget of value only to discover it can’t be implemented?
Build an analytical ecosystem with a balanced focus on all aspects of data management. This will mean that value delivery is limited only by the imagination of your employees. Rather than questioning the value of an analytics team, you will attract some of the best and the brightest. Then, you will finally be able to deliver on the promised value of your data.