I think I may have gone to too many conferences in 2014 in which the potential of big data was discussed. After a while all the stories blurred into two main themes:
- Companies have gone bankrupt at a time when demand for their core products increased.
- Data from mobile phones, cars and other machines houses a gold mine of value – we should all be using it.
My main takeaway from the 2014 conferences was that no amount of data can compensate for poor strategy, or for a lack of organisational agility to adapt business processes in times of disruption. However, I still feel that as an industry our stories are stuck in the phase of ‘Big Data Hype’, while most organisations are beyond the hype and need practicalities, guidance and inspiration to turn their big data projects into a success. This is possibly due to the limited number of big data projects in production, or perhaps it is too early to measure the long term results of existing projects. Another possibility is that the projects are delivering significant competitive advantage, so the stories will remain under wraps for the time being.
However, towards the end of 2014 I stumbled across a big data success story in an unexpected place. It did (literally) provide competitive advantage, and since it has been running for a number of years the results are plain to see. It started with a book recommendation from a friend. ‘Faster’ by Michael Hutchinson is written as a self-propelled investigation into the difference between world champion and world class athletes. It promised to satisfy my slightly geeky tendency to enjoy facts, numerical details and statistics. It did this, but what really struck me was that it reads as a ‘how-to’ guide for big data projects.
Mr Hutchinson’s book is an excellent read as an insight into professional cycling by a professional cyclist. It is stacked with interesting facts and well-written anecdotes, and I highly recommend reading it. Since the big-data aspect was a sub-plot, I will pull out the highlights without distracting from the main story.
Here are the five steps I extracted for big data project success:
1. Have a clear vision and goal for your project
The Sydney Olympics in 2000 had produced only 4 medals across all cycling disciplines for British cyclists. With a home Olympics set for 2012, British Cycling desperately wanted to improve this performance. Specific targets were set across all disciplines, stated as the times an athlete needed to achieve in order to win a race.
2. Determine the data required to support these goals
Unlike many big data projects which start with a data set and then wonder what to do with it, British Cycling did this the other way around. They worked out what they needed to measure in order to establish the influencers on their goal (track time) and set about gathering this information. In their case this involved gathering wind tunnel data to compare & contrast equipment, as well as physiological data from athletes and all information from cycling activities.
3. Experiment in order to establish causality
Most big data projects involve experimentation by changing the environment whilst gathering a sub-set of data points. The number of variables to adjust in cycling is large, but all were embraced. Data (including video) was gathered on the effects of small changes in each component: Bike, Clothing, Athlete (training and nutrition).
4. Guide your employees on how to use the results of the data
Like many employees, cyclists and coaches were convinced of the ‘best way’ to achieve results based on their own personal experience. In some cases, analysis of the data showed that the perceived best way was in fact not the best way. Coaching staff trusted the data, and convinced the athletes to change aspects of both training and nutrition. This was not necessarily easy to do, as it could mean fundamental changes in the athlete’s lifestyle.
5. Embrace innovation
Cycling is a very conservative sport by nature, with many of the key innovations coming from adjacent sports such as triathlon. Data, however, is not steeped in tradition and does not have pre-conceived ideas as to what equipment should look like, or what constitutes an excellent recovery drink. What made British Cycling’s big data initiatives successful is that they allowed themselves to be guided by the data and put the recommendations into practice. Plastic finished skin suits are probably not the most obvious choice for clothing, but they proved to be the biggest advantage a cyclist could get, far more than tinkering with the bike. (In fact they produced so much advantage they were banned shortly after the 2008 Olympics.)
The results: British Cycling won 4 Olympic medals in 2000, one of which was gold. In 2012 they grabbed 8 gold, 2 silver and 2 bronze medals. A quick glance at their website shows that it is not just Olympic medals they are winning – the number of medals won across all world championship events has also increased since 2000.
To me, this is one of the best big data stories, as it directly shows how to be successful using big data strategies in a completely analogue world. I think it is more insightful than the mere fact that we are producing ever-increasing volumes of data. The real value of big data is in understanding what portion of all available data will contribute to you achieving your goals, and then embracing the use of the results of analysis to make constructive changes in daily activities.
But then again, I may just like the story because it involves geeky facts, statistics and fast bicycles.
At the DIA conference in Berlin this month, Frits Stulp of Mesa Arch Consulting suggested that IDMP could get the business asking for MDM. After looking at the requirements for IDMP compliance for approximately a year, his conclusion from a business point of view is that MDM has a key role to play in IDMP compliance. A recent press release by Andrew Marr, an IDMP and XEVMPD expert and specialist consultant, also shows support for MDM being ‘an advantageous thing to do’ for IDMP compliance. A previous blog outlined my thoughts on why MDM can turn regulatory compliance into an opportunity, instead of a cost. It seems that others are now seeing this opportunity too.
So why will IDMP enable the business (primarily regulatory affairs) to come to the conclusion that they need MDM? At its heart, IDMP is a pharmacovigilance initiative which has a goal to uniquely identify all medicines globally, and have rapid access to the details of the medicine’s attributes. If implemented in its ideal state, IDMP will deliver a single, accurate and trusted version of a medicinal product which can be used for multiple analytical and procedural purposes. This is exactly what MDM is designed to do.
Here is a summary of the key reasons why an MDM-based approach to IDMP is such a good fit.
1. IDMP is a data consolidation effort; MDM enables data discovery & consolidation
- IDMP will probably need to populate between 150 and 300 attributes per medicine
- These attributes will be held in 10 to 13 systems, per product.
- MDM (especially with close coupling to Data Integration) can easily discover and collect this data.
2. IDMP requires cross-referencing; MDM has cross-referencing and cleansing as key process steps.
- Consolidating data from multiple systems normally means dealing with multiple identifiers per product.
- Different entities must be linked to each other to build relationships within the IDMP model.
- MDM allows for complex models catering for multiple identifiers and relationships between entities.
3. IDMP submissions must ensure the correct value of an attribute is submitted; MDM has strong capabilities to resolve different attribute values.
- Many attributes will exist in more than one of the 10 to 13 source systems
- Without strong data governance, these values can be (and probably will be) different.
- MDM can set rules for determining the ‘golden source’ for each attribute, and then track the history of these values used for submission.
4. IDMP is a translation effort; MDM is designed to translate
- Submission will need to be within a defined vocabulary or set of reference data
- Different regulators may opt for different vocabularies, in addition to the internal set of reference data.
- MDM can hold multiple values/vocabularies for entities, depending on context.
5. IDMP is a large co-ordination effort; MDM enables governance and is generally associated with higher data consistency and quality throughout an organisation.
- The IDMP scope is broad, so attributes required by IDMP may also be required for compliance with other regulations.
- Accurate compliance needs tracking and distribution of attribute values. Attribute values submitted for IDMP, other regulations, and supporting internal business should be the same.
- Not only is MDM designed to collect and cleanse data, it is equally suited to distributing data and co-ordinating values across systems.
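To make point 3 above a little more concrete, here is a minimal Python sketch of a ‘golden source’ rule. It is an illustration only: the system names, the priority table and the function names are all hypothetical, and real MDM tools configure such survivorship rules in their own way.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourceValue:
    system: str    # hypothetical source system name
    value: str     # attribute value as held in that system
    updated: date  # when the system last changed the value


# Hypothetical trust ranking per attribute: earlier in the list wins.
SOURCE_PRIORITY = {
    "strength": ["regulatory_db", "erp", "labelling"],
    "product_name": ["labelling", "regulatory_db", "erp"],
}


def golden_value(attribute, candidates):
    """Pick the surviving value for one attribute.

    Rule (an assumption for this sketch): the most trusted system wins,
    and ties are broken by the most recent update.
    """
    ranking = SOURCE_PRIORITY[attribute]
    known = [c for c in candidates if c.system in ranking]
    if not known:
        return None
    return min(
        known,
        key=lambda c: (ranking.index(c.system), -c.updated.toordinal()),
    )


def distinct_values(candidates):
    """List the distinct values so disagreements can be flagged for review."""
    return sorted({c.value for c in candidates})


# Three systems hold the same attribute with three different spellings.
candidates = [
    SourceValue("erp", "500 mg", date(2014, 3, 1)),
    SourceValue("regulatory_db", "500mg", date(2013, 11, 5)),
    SourceValue("labelling", "0.5 g", date(2014, 6, 1)),
]

chosen = golden_value("strength", candidates)
disagreements = distinct_values(candidates)
```

In this sketch `chosen` comes from the hypothetical `regulatory_db` system because it ranks first for `strength`, while `disagreements` holds all three spellings, which is exactly the kind of conflict that needs a governance decision rather than a technical one.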
Once business users assess the data management requirements, and consider the breadth of the IDMP scope, it is no surprise that some of them could be asking for an MDM solution. Even if they do not use the acronym ‘MDM’ they could actually be asking for MDM by capabilities rather than name.
Given the good technical fit of an MDM approach to IDMP compliance, I would like to put forward three arguments as to why the approach makes sense. There may be others, but these are the ones I feel are most compelling:
1. Better chance to meet tight submission time
There is slightly over 18 months left before the EMA requires IDMP compliance. Waiting for final guidance will not provide enough time for compliance. Using MDM you have a tool to begin with the most time consuming tasks: data discovery, collection and consolidation. Required XEVMPD data, and the draft guidance can serve as a guide as to where to focus your efforts.
2. Reduce Risk of non-compliance
With fines in Europe of ‘up to 5% of revenue’ at stake, risking non-compliance could be expensive. Not only will MDM increase your chance of compliance on July 1, 2016, but it will also give you a tool to manage your data and ensure ongoing compliance in terms of meeting deadlines for delivering new data and data changes.
3. Your company will have a ready source of clean, multi-purpose product data
Unlike some Regulatory Information Management tools, MDM is not a single-purpose tool. It is specifically designed to provide consolidated, high-quality master data to multiple systems and business processes. This data source could be used to deliver high-quality data to multiple other initiatives, in particular compliance with other regulations, and projects addressing topics such as Traceability, Health Economics & Outcomes, Continuous Process Verification and Inventory Reduction.
So back to the original question – will the introduction of IDMP regulation in Europe result in the business asking IT to implement MDM? Perhaps they will, but not by name. It is still possible that they won’t. However, if you have been struggling to get buy-in for MDM within your organisation and you need to comply with IDMP, then you may be able to find some more allies (potentially with an approved budget) to support you in your MDM efforts.
Part 1 of this blog touched on the differences between PIM and Product MDM. Since both play a role in ensuring the availability of high quality product data, it is easy to see the temptation to extend the scope of either product to play a more complete part. However, there are risks involved in customising software. PIM and MDM are not exceptions, and any customisations will carry some risk.
In the specific case of looking to extend the role of PIM, the problems start if you just look at the data and think: “oh, this is just a few more product attributes to add”. This will not give you a clear picture of the effort or risk associated with customisations. A complete picture requires looking beyond the attributes as data fields, and considering them in context: which processes and people (roles) are supported by these attributes?
Recently we were asked to assess the risk of PIM customisation for a customer. The situation was that data to be included in PIM was currently housed in separate, home grown and aging legacy systems. One school of thought was to move all the data, and their management tasks, into PIM and retire the three systems. That is, extending the role of PIM beyond a marketing application and into a Product MDM role. In this case, we found three main risks of customising PIM for this purpose. Here they are in more detail:
1. Decrease speed of PIM deployment
- Inclusion of the functionality (not just the data) will require customisations in PIM, not just additional attributes in the data model.
- Logic customisations are required for data validity checks, and some value calculations.
- Additional screens, workflows, integrations and UI customisations will be required for non-marketing roles
- PIM will become the source for some data, which is used in critical operational systems (e.g. SAP). Reference checks & data validation cannot be taken lightly due to risks of poor data elsewhere.
- Bottom line: A non-standard deployment will drive up implementation cost, time and risk.
2. Reduce marketing agility
- In the case concerned, whilst the additional data was important to marketing, it primarily supports non-marketing users and processes, including Product Development, Sales and Manufacturing
- The legacy systems are key to these users' workflow for creating and distributing technical details of new products to other systems, e.g. SAP for production
- If the systems are retired and replaced with PIM, these non-marketing users will need to be equal partners in PIM:
  - Require access and customised roles
  - Influence over configuration
  - Equal vote in feature/function prioritisation
- Bottom Line: Marketing will no longer completely own the PIM system, and may have to sacrifice new functionality to prioritise supporting other roles.
3. Risk of marketing abandoning the hybrid tool in the mid-term
- An investment in PIM is usually an investment by Marketing to help them rapidly adapt to a dynamic external market.
- System agility (point 2) is key to rapid adaption, as is the ability to take advantage of new features within any packaged application.
- As more customisations are made, the cost of upgrades can become prohibitive, driven by the cost to upgrade customisations.
  - Cost often driven by consulting fees to change what could be poorly documented code.
  - Risk of falling behind on upgrades, and hence sacrificing access to the newest PIM functionality
- If upgrades are more expensive than new tools, PIM will be abandoned by Marketing, and they will invest in a new tool.
- Bottom line: In a worst case scenario, a customised PIM solution could be left supporting non-marketing functionality with Marketing investing in a new tool.
The first response to the last bullet point is normally “no they wouldn’t”. Unfortunately this is a pattern both I and some of my colleagues have seen in the area of marketing & eCommerce applications. The problem is that these areas are so fast moving, that nobody can afford to fall behind in terms of new functionality. If upgrades are large projects which need lengthy approval and implementation cycles, marketing is unlikely to wait. It is far easier to start again with a smaller budget under their direct control. (Which is where PIM should be in the first place.)
- Making PIM look and behave like Product MDM could have some undesirable consequences – both in the short term (current deployment) and in the longer term (application abandonment).
- A choice for customising PIM vs. enhancing your landscape with Product MDM should be made not on data attributes alone.
- Your business and data processes should guide you in terms of risk assessment for customisation of your PIM solution.
Bottom Line: If the risks seem too large, then consider enhancing your IT landscape with Product MDM. Trading PIM cost & risk for measurable business value delivered by MDM will make a very attractive business case.
Working for Informatica has many advantages. One of them is that I clearly understand the difference between Product Information Management (PIM) and Master Data Management (MDM) for product data[i]. Since I have this clear in my own mind, it is easy to forget that this may not be as obvious to others. As frequently happens, it takes a customer to help us articulate why PIM is not the same as Product MDM. Now that this is fresh in my mind again, I thought I would share why the two are different, and when you should consider each one, or both.
In a lengthy discussion with our customer, many points were raised, discussed and classified. In the end, all arguments essentially came down to each technology’s primary purpose. A different primary purpose means that typical capabilities of the two products are geared towards different audiences and use cases.
PIM is a business application that centralizes and streamlines the creation and enhancement of consistent, but localised product content across channels. (Figure 1)
Figure 1: PIM Product Data Creation Flow
Product MDM is an infrastructure component that consolidates the core global product data that should be consistent across multiple and diverse systems and business processes, but typically isn’t. (Figure 2)
Figure 2: MDM Product Data Consolidation Hub
The choice between the two technologies really comes down to the current challenge you are trying to solve. If you cannot get clean and consistent data out through all your sales channels fast enough, then a PIM solution is the correct choice for you. However, if your organisation is making poor decisions and seeing bloated costs (e.g. procurement or inventory costs) due to poor internal product data, then MDM technology is the right choice.
But, if it is so simple – why am I even writing this down? Why are the lines blurring now?
Here is my 3-part theory:
- A focus on good quality product data is a relatively recent trend. Different industries started by addressing different challenges.
  - PIM has primarily been used in retail B2C environments and distributor B2B or B2C environments. That is, organisations which are primarily focused around the sale of a product, rather than the design and production of the product.
  - Product MDM has been used predominantly by manufacturers of goods, looking to standardise and support global processes, reporting and analytics across departments.
- Now, manufacturers are increasingly looking to take control of their product information outside their organisation.
  - This trend is most notable in Consumer Goods (CG) companies.
  - Increasingly, consistent, appealing and high quality data in the consumer realm is making the difference between choosing your product vs. a competitor’s.
  - CG companies must ensure all channels – their own and their retail partners’ – are fed with high quality product data.
- So PIM is now entering organisations which should already have a Product MDM tool. If they don’t, confusion arises.
  - When Marketing buys PIM (and it normally is Marketing), quite frankly this shows up the poor product data management upstream of marketing.
  - It becomes quite tempting to try to jam as much product data into a PIM system as possible, going beyond the original scope of PIM.
The follow-on question is clear: why can’t we just make a few changes and use PIM as our MDM technology, or MDM as our PIM solution? It is very tempting. Both data models can be extended to add extra fields. In Informatica’s case, both are supported by a common, feature-rich workflow tool. However, there are inherent risks in using PIM where Product MDM is needed or Product MDM where PIM is needed.
After discussions with our customer, we identified 3 risks of modifying PIM when it is really Product MDM functionality that is needed:
- Decrease speed of PIM deployment
- Reduce marketing agility
- Risk of marketing abandoning the hybrid tool in the mid-term
The last turned out to be the least understood, but that doesn’t make it any less real. Since each of these risks deserves more explanation, I will discuss them in Part 2 of this Blog. (Still to be published)
In summary, PIM and Product MDM are designed to play different roles in the quest for the availability of high quality product data both internally and externally. There are risks and costs associated with modifying one to take on the role of the other. In many cases there is place for both PIM and MDM, but you will still need to choose a starting point. Each journey to high quality product data will be different, but the goal is still the same – to turn product data into business value.
I (or one of my colleagues in a city near you) will be happy to help you understand what the best starting point is for your organisation.
[i] In case you were wondering, this is not the benefit that I joined Informatica for.
“Victory won’t go to those with the most data. It will go to those who make the best use of data.” – Doug Henschen, Information Week, May 2014
But how do you actually make best use of your data and become one of the data success stories? If you are going to differentiate on data, you need to use your data to innovate. Common options include:
- New products & services which leverage a rich data set
- Different ways to sell & market existing products and services based on detailed knowledge
But there is no ‘app for that’. Think about it – if you can buy an application, you are already too late. Somebody else has identified a need and created a product they expect to sell repeatedly. Applications cannot provide you with a competitive advantage if everyone has one. Most people agree they will not rise to the top because they have installed ERP, CRM, SRM, etc. The same will be true of any application which claims to win you market share and profits based on data. If you want to differentiate, you need to stay ahead of the application curve, and let your internal innovation drive you forward.
Simplistically this is a 4 step process:
- Assemble a team of innovative employees, match them with skilled data scientists
- Identify data-based differentiation opportunities
- Feed the team high quality data at the rate at which they need it
- Provide them tools for data analysis and integrating data into business processes as required
Leaving aside the simplicity of these steps for a process – there is one key change to a ‘normal’ IT project. Normally data provisioning is an afterthought during IT projects. Now it must take priority. Frequently data integration is poorly executed, and barely documented. Data quality is rarely considered during projects. Poor data provisioning is a direct cause of spaghetti charts which contribute to organisational inflexibility and poor data availability to the business. Does “It will take 6 months to make those changes” sound familiar?
We have been told Big Data will change our world; Data is a raw material; Data is the new oil.
The business world is changing. We are moving into a world where our data is one of our most valuable resources, especially when coupled with our internal innovation. Applications used to differentiate us, now they are becoming commodities to be replaced and upgraded, or new ones acquired as rapidly as our business changes.
I believe that in order to differentiate on data, an organisation needs to treat data as the valuable resource we all say it is. Data Agility, Management and Governance are the true differentiators of our era. This is a frustration for those trying to innovate while locked in an inflexible data world, built at a time when people still expected ERP to be the answer to everything.
To paraphrase a recent complaint I heard: “My applications should be like my phone. I buy a new one, turn it on and it already has all my data”.
This is the exact vision that is driving Informatica’s Intelligent Data Platform.
In the end, differentiating on data comes down to one key necessity: High quality data MUST be available to all who need it, when they need it.
Total Quality Management, as it relates to products and services, has its roots in the 1920s. The 1960s provided a huge boost with the rise of gurus such as Deming, Juran and Crosby. Whilst each had their own contribution, common principles for TQM that emerged in this era remain in practice today:
- Management (C-level management) is ultimately responsible for quality
- Poor quality has a cost
- The earlier in the process you address quality, the lower the cost of correcting it
- Quality should be designed into the system
So for 70 years industry in general has understood the cost of poor quality, and how to avoid these costs. So why is it that in 2014 I was party to a conversation that included the statement:
“I only came to the conference to see if you (Informatica) have solved the data quality problem.”
Ironically the TQM movement was only possible based on the analysis of data, but this is the one aspect that is widely ignored during TQM implementation. So much for ‘Total’ Quality Management.
This person is not alone in their thoughts. Many are waiting for the IT knight in shining armour, the latest and greatest data quality tools secured on their majestic steed, to ride in and save the day. Data quality dragon slain, cold drinks all round, job done. This will not happen. Put data quality in the context of total quality management principles to see why: A single department cannot deliver data quality alone, regardless of the strength of their armoury.
I am not sure anyone would demand a guarantee of a high quality product from their machinery manufacturers. Communications providers cannot deliver high quality customer service organisations through technology alone. These suppliers will have an influence on final product quality, but everyone understands that equipment cannot deliver in isolation. Good quality raw materials, staff who genuinely take pride in their work and the correct incentives are key to producing high quality products and services.
So why is there an expectation that data quality can be solved by tools alone?
At a minimum senior management support is required to push other departments to change their behaviour and/or values. So why aren’t senior management convinced that data quality is a problem worth their attention the way product & service quality is?
The fact that poor data quality has a high cost is reasonably well known via anecdotes. However, cost has not been well quantified, and hence fails to grab the attention of senior management. A 2005 paper by Richard Marsh[i] states: “Research and reports by industry experts, including Gartner Group, PriceWaterhouseCoopers and The Data Warehousing Institute clearly identify a crisis in data quality management and a reluctance among senior decision makers to do enough about it.” Little has changed since 2005.
However, we are living in a world where data generation, processing and consumption are increasing exponentially. With all the hype and investment in data, we face the grim prospect of fully embracing an age of data-driven-everything founded on a very poor quality raw material. Data quality is expected to be applied after generation, during the analytic phase. How much will that cost us? In order to function effectively, our new data-driven world must have high quality data running through every system and activity in an organization.
The Total Data Quality Movement is long overdue.
Only when every person in every organization understands the value of the data, do we have a chance of collectively solving the problem of poor data quality. Data quality must be considered from data generation, through transactional processing and analysis right until the point of archiving.
Informatica DQ supports IT departments in automating data correction where possible, and highlighting poor data for further attention where automation is not possible. MDM plays an important role in sustaining high quality data. Informatica tools empower the business to share the responsibility for total data quality.
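As a rough illustration of the split between automatic correction and flagging for human attention, here is a minimal Python sketch. It is not Informatica's API; the function name, the status labels and the choice of a two-letter country code as the example field are all assumptions made for this example.

```python
import re


def clean_country_code(raw):
    """Return (value, status) for one field.

    status is 'ok', 'corrected' or 'needs_review' (hypothetical labels).
    Only safe, mechanical fixes are automated: trimming whitespace and
    upper-casing. Anything else is passed to a person untouched.
    """
    if raw is None:
        return None, "needs_review"
    candidate = raw.strip().upper()
    if re.fullmatch(r"[A-Z]{2}", candidate):
        return candidate, ("ok" if candidate == raw else "corrected")
    return raw, "needs_review"


records = ["DE", " nl ", "Germany", None]
results = [clean_country_code(r) for r in records]
review_queue = [value for value, status in results if status == "needs_review"]
```

The design choice worth noting is that the rule never guesses: ‘Germany’ is not silently mapped to a code, it lands in `review_queue` so that the person who understands the data can decide.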
We are ready for Total Data Quality, but continue to await the Total Data Quality Movement to get off the ground.
(If you do not have time to wait for TDQM to gain traction, we can help you measure the cost of poor quality data in your organization to win corporate buy-in now.)
But it’s not as easy as a couple of queries. The reality is that the body of knowledge in question is seldom in a shape recognizable as a ‘body’. In most corporations, the data regulators are asking for is distributed throughout the organization. Perhaps a ‘Scattering of Knowledge’ is a more appropriate metaphor.
It is time to accept that data distribution is here to stay. The idea of a single ERP has long gone. Hype around Big Data is dying down, and being replaced by a focus on all data as a valuable asset. IT architectures are becoming more complex as additional data storage and data fueled applications are introduced. In fact, the rise of Data Governance’s profile within large organizations is testament to the acceptance of data distribution, and the need to manage it. Forrester has just released their first Forrester Wave ™ on data governance. They state it is time to address governance as “Data-driven opportunities for competitive advantage abound. As a consequence, the importance of data governance — and the need for tooling to facilitate data governance —is rising.” (Informatica is recognized as a Leader)
However, Data Governance Programs are not yet as widespread as they should be. Unfortunately it is hard to directly link strong Data Governance to business value. This means trouble getting a senior exec to sponsor the investment and cultural change required for strong governance. Which brings me back to the opportunity within Regulatory Compliance. My thinking goes like this:
- Regulatory compliance is often about gathering and submitting high quality data
- This is hard as the data is distributed, and the quality may be questionable
- Tools are required to gather, cleanse, manage and submit data for compliance
- There is a high overlap of tools & processes for Data Governance and Regulatory Compliance
So – why not use Regulatory Compliance as an opportunity to pilot Data Governance tools, process and practice?
Far too often compliance is a once-off effort with a specific tool. This tool collects data from disparate sources, with unknown data quality. The underlying data processes are not addressed. Strong Governance will have a positive effect on compliance – continually increasing data access and quality, and hence reducing the cost and effort of compliance. Since the cost of non-compliance is often measured in millions, getting exec sponsorship for a compliance-based pilot may be easier than for a broader Data Governance project. Once implemented, lessons learned and benefits realized can be leveraged to expand Data Governance into other areas.
Previously I likened Regulatory Compliance to a Buy One, Get One Free opportunity: Compliance + a free performance boost. If you use your compliance budget to pilot Data Governance – the boost will be larger than simply implementing Data Quality and MDM tools. The business case shouldn’t be too hard to build. Consider that EY’s research shows that companies that successfully use data are already outperforming their peers by as much as 20%.[i]
Data Governance Benefit = (Cost of non-compliance + 20% performance boost) – compliance budget
Yes, the equation can be considered simplistic. But it is compelling.
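To show how the equation might be filled in, here is a small Python sketch. Every figure in it is hypothetical and chosen only to make the arithmetic easy to follow. It also makes one assumption the equation leaves open: the 20% performance boost is converted into money by applying it to annual profit.

```python
def governance_benefit(non_compliance_cost, performance_boost_value, compliance_budget):
    """Data Governance Benefit = (cost of non-compliance + boost) - compliance budget."""
    return (non_compliance_cost + performance_boost_value) - compliance_budget


# Hypothetical inputs, in whole currency units.
annual_profit = 50_000_000
boost_percent = 20                # the 'as much as 20%' figure, taken at face value
non_compliance_cost = 5_000_000   # assumed fine avoided by complying
compliance_budget = 1_000_000     # assumed spend on the compliance-based pilot

# Assumption: the boost applies to annual profit. Integer maths avoids rounding noise.
boost_value = annual_profit * boost_percent // 100

benefit = governance_benefit(non_compliance_cost, boost_value, compliance_budget)
```

With these made-up numbers the boost is worth 10,000,000 and the benefit comes to 14,000,000. Since 20% is the upper end of the quoted range, a cautious business case would rerun the calculation with a smaller `boost_percent`.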
Regardless of the industry, new regulatory compliance requirements are more often than not treated like the introduction of a new tax. A few may be supportive, some will see the benefits, but most will focus on the negatives – the cost, the effort, the intrusion into private matters. There will more than likely be a lot of grumbling.
Across many industries there is currently a lot of grumbling, as new regulation seems to be springing up all over the place. Pharmaceutical companies have to deal with IDMP in Europe and UDI in the USA. This is hot on the heels of the US Sunshine Act, which is being followed in Europe by Aggregate Spend requirements. Consumer Goods companies in Europe are looking at the consequences of beefed-up 1169 requirements. Financial Institutes are mulling over compliance with BCBS-239. Behind the grumbling, most organisations across all verticals appear to have a similar approach to regulatory compliance. The pattern seems to go like this:
- Delay (The requirements may change)
- Scramble (They want it when? Why didn’t we get more time?)
- Code to Spec (Provide exactly what they want, and only what they want)
No wonder these requirements are seen as purely a cost and an annoyance. But it doesn’t have to be that way, and in fact, it should not. Just like I have seen a pattern in response to compliance, I see a pattern in the requirements themselves:
- The regulators want data
- Their requirements will change
- When they do change, regulators will want even more data!
Now read the last 3 bullet points again, but use ‘executives’ or ‘management’ or ‘the business people’ instead of ‘regulators’. The pattern still holds true. The irony is that execs will quickly sign off on budget to meet regulatory requirements, but find it hard to see the value in “infrastructure” projects. Projects that will deliver this same data to their internal teams.
This is where the opportunity comes in. PwC’s 2013 State of Compliance Report[i] shows that over 42% of central compliance budgets are in excess of $1m. A significant figure. Efforts outside of the compliance team imply a higher actual cost. Large budgets are not surprising in multi-national companies, who often have to satisfy multiple regulators in a number of countries. As an alternative to multiple overlapping compliance projects, what if this significant budget was repurposed to create a flexible data management platform? This approach could deliver compliance, but provide even more value internally.
Almost all internal teams are currently clamouring for additional data to drive their newest application. Pharma and CG sales & marketing teams would love ready access to detailed product information. So would consumer and patient support staff, as well as down-stream partners. Trading desks and client managers within Financial Institutes should really have real-time access to their risk profiles to guide daily decision making. These data needs will not be going away. Why should regulators be prioritised over the people who drive your bottom line and who are guardians of your brand?
A flexible data management platform will serve everyone equally. Foundational tools for a flexible data management platform exist today including Data Quality, MDM, PIM and VIBE, Informatica’s Virtual Data Machine. Each of them plays a significant role in easing regulatory compliance, and as a bonus they deliver measurable business value in their own right. Implemented correctly, you will get enhanced data agility & visibility across the entire organisation as part of your compliance efforts. Sounds like ‘Buy One Get One Free’, or BOGOF in retail terms.
Unlike taxes, BOGOF opportunities are normally embraced with open arms. Regulatory compliance should receive a similar welcome – an opportunity to build the foundations for universal delivery of data which is safe, clean and connected. A 2011 study by The Economist found that effective regulatory compliance benefits businesses across a wide range of performance metrics[ii].
Is it time to get your free performance boost?
In almost all cases, poor quality master data is not a laughing matter. It can directly lead to losing out on millions either through excess costs or lost opportunity. However, there are occasions where poor data has a lighter side. One such case happened in the mid ‘90s whilst I was implementing ERP for a living.
On the first phase of a major project, I was on the master data team – in particular developing processes that enabled master data creation and maintenance. Since both MDM and workflow tools were in their infancy, we conjured up manual processes supported by green-screen-based scripts and email. Not great, but workable. However, before go-live, system test was proving painful, with every other team wanting lots of material numbers for testing. Since manual entry excited no-one, we created a product master file, populated by cut-and-paste, which the IT guys happily automated for us to create hundreds of material masters.
Testing was a success, but then something went wrong. Actually two things went wrong – the first being the script was designed to load test data and had no data quality reviews, but somehow it was used to load production data. This brought on the second problem which our master data team found highly amusing.
A couple of days after go-live, I got a call from ‘Phil’ – the shift supervisor in shipping.
After the usual pleasantries, it was clear that something was really bothering Phil, and he had been grappling with a problem for 45 minutes or so. It came down to this:
“No matter how hard we try, we cannot fit 32 desktops on a pallet, even if we shrink wrap it”
I was a bit confused – pallets were sized for 32 laptops, or 16 desktops. Why would Phil be trying to put 32 desktops on a single pallet? Some brief queries showed there was an error in the master data. Since system test did not actually involve any physical actions in the real world, nobody noticed that our master data was inaccurate, and all products were defined as 32 per pallet. This was the data the script had loaded into the new ERP.
I suggested Phil continue as usual (16 per pallet), regardless of what the system said in terms of items per pallet, and I would raise a support request to get the data fixed.
I still try to imagine how many warehouse employees it takes to hold 32 desktops on a pallet designed for 16, whilst another armed with a portable shrink-wrapper desperately tries to wrap desktops, but not hands or whole people onto the pallet. I imagine all would be cursing ‘the new system’ loudly during the process.
There are a few important lessons out of this, some of which are:
- Without the correct tools and care, your data will quickly be infested with inaccuracies
- New systems are not immune to poor data quality (and may be at greater risk)
- Appointed data custodians should care about the data and have a stake in its accuracy
And perhaps most interestingly:
People will believe the system, even if their instinct tells them otherwise.
Which is probably one of the best reasons to ensure your data is correct. This was a clear demonstration of how poor data quality can directly & negatively affect daily business processes.
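Even a very simple plausibility check in the load script would have caught the pallet problem before it reached shipping. The sketch below is purely illustrative: the field names, material numbers and the `find_pallet_errors` helper are hypothetical, and the limits are simply the 32 laptops or 16 desktops per pallet from the story, not anything from the real system.

```python
# Hypothetical pallet capacities taken from the anecdote above.
MAX_UNITS_PER_PALLET = {"laptop": 32, "desktop": 16}


def find_pallet_errors(records):
    """Return (material, reason) pairs for records that fail the check."""
    errors = []
    for rec in records:
        limit = MAX_UNITS_PER_PALLET.get(rec["category"])
        if limit is None:
            errors.append((rec["material"], "unknown category"))
        elif not 0 < rec["units_per_pallet"] <= limit:
            reason = f"units_per_pallet {rec['units_per_pallet']} outside 1..{limit}"
            errors.append((rec["material"], reason))
    return errors


# Made-up material masters, every one defaulted to 32 per pallet.
records = [
    {"material": "L-100", "category": "laptop", "units_per_pallet": 32},
    {"material": "D-200", "category": "desktop", "units_per_pallet": 32},
]
errors = find_pallet_errors(records)
for material, reason in errors:
    print(f"{material}: {reason}")
```

Run against the two sample records, only the desktop is flagged. A check like this does not replace proper data quality tooling, but it shows how little it takes to stop an obviously impossible value from reaching production.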
A real pity that this incident predated smartphones as well as Master Data Management and workflow tools. If it happened today, I might have had a great photo to remind me of the importance of accurate master data.
For the past few years, the press has been buzzing about the potential value of Big Data. However, there is little coverage focusing on the data itself – how do you get it, is it accurate, and who can be trusted with it?
We are the source of data that is often spoken about – our children, friends and relatives and especially those people we know on Facebook or LinkedIn. Over 40% of Big Data projects are in the sales and marketing arena – relying on personal data as a driving force. While machines have no choice but to provide data when requested, people do have a choice. We can choose not to provide data, or to purposely obscure our data, or to make it up entirely.
So, how can you ensure that your organization is receiving real information? Active participation is needed to ensure a constant flow of accurate data to feed your data-hungry algorithms and processes. While click-stream analysis does not require individual identification, follow-up sales & marketing campaigns will have limited value if the public at large is using false names and pretend information.
BCG has identified a link between trust and data sharing:
“We estimate that those that manage this issue well [creating trust] should be able to increase the amount of consumer data they can access by at least five to ten times in most countries.”[i]
With that in mind, how do you create the trust that will entice people to share data? The principles behind common data privacy laws provide guidelines. These include: accountability, purpose identification and disclosure, collection with knowledge and consent, data accuracy, individual access and correction, as well as the right to be forgotten.
But there are challenges in personal data stewardship – in part because the current world of Big Data analysis is far from stable. In the ongoing search for the value of Big Data, new technologies, tools and approaches are being piloted. Experimentation is still required which means moving data around between data storage technologies and analytical tools, and giving unprecedented access to data in terms of quantity, detail and variety to ever growing teams of analysts. This experimentation should not be discouraged, but it must not degrade the accuracy or security of your customers’ personal data.
How do you measure up? If I made contact and asked for the sum total of what you knew about me, and how my data was being used – how long would it take to provide this information? Would I be able to correct my information? How many of your analysts can view my personal data and how many copies have you distributed in your IT landscape? Are these copies even accurate?
Through our data quality, data mastering and data masking tools, Informatica can deliver a coordinated approach to managing your customers’ personal data and build trust by ensuring the safety and accuracy of that data. With Informatica managing your customers’ data, your internal team can focus their attention on analytics. Analytics from accurate data can help develop the customer loyalty and engagement that is vital to both the future security of your business and continued collection of accurate data to feed your Big Data analysis.
[i] The Trust Advantage: How to Win with Big Data; bcg.perspectives November 2013