Tag Archives: Data Quality
Total Quality Management, as it relates to products and services, has its roots in the 1920s. The 1960s provided a huge boost with the rise of gurus such as Deming, Juran and Crosby. Whilst each had their own contribution, common principles for TQM that emerged in this era remain in practice today:
- Management (C-level management) is ultimately responsible for quality
- Poor quality has a cost
- The earlier in the process you address quality, the lower the cost of correcting it
- Quality should be designed into the system
So for 70 years industry in general has understood the cost of poor quality, and how to avoid these costs. So why is it that in 2014 I was party to a conversation that included the statement:
“I only came to the conference to see if you (Informatica) have solved the data quality problem.”
Ironically, the TQM movement was only possible thanks to the analysis of data, yet this is the one aspect that is widely ignored during TQM implementation. So much for ‘Total’ Quality Management.
This person is not alone in their thoughts. Many are waiting for the IT knight in shining armour, the latest and greatest data quality tools secured on their majestic steed, to ride in and save the day. Data quality dragon slain, cold drinks all round, job done. This will not happen. Put data quality in the context of total quality management principles to see why: a single department cannot deliver data quality alone, regardless of the strength of their armoury.
I am not sure anyone would demand a guarantee of a high-quality product from their machinery manufacturers. Communications providers cannot deliver high-quality customer service organisations through technology alone. These suppliers will have an influence on final product quality, but everyone understands that equipment cannot deliver in isolation. Good-quality raw materials, staff who genuinely take pride in their work and the correct incentives are key to producing high-quality products and services.
So why is there an expectation that data quality can be solved by tools alone?
At a minimum senior management support is required to push other departments to change their behaviour and/or values. So why aren’t senior management convinced that data quality is a problem worth their attention the way product & service quality is?
The fact that poor data quality has a high cost is reasonably well known via anecdotes. However, cost has not been well quantified, and hence fails to grab the attention of senior management. A 2005 paper by Richard Marsh[i] states: “Research and reports by industry experts, including Gartner Group, PriceWaterhouseCoopers and The Data Warehousing Institute clearly identify a crisis in data quality management and a reluctance among senior decision makers to do enough about it.” Little has changed since 2005.
However, we are living in a world where data generation, processing and consumption are increasing exponentially. With all the hype and investment in data, we face the grim prospect of fully embracing an age of data-driven-everything founded on a very poor quality raw material. Data quality is expected to be applied after generation, during the analytic phase. How much will that cost us? In order to function effectively, our new data-driven world must have high quality data running through every system and activity in an organization.
The Total Data Quality Movement is long overdue.
Only when every person in every organization understands the value of data do we have a chance of collectively solving the problem of poor data quality. Data quality must be considered from data generation, through transactional processing and analysis, right up to the point of archiving.
Informatica DQ supports IT departments in automating data correction where possible, and highlighting poor data for further attention where automation is not possible. MDM plays an important role in sustaining high quality data. Informatica tools empower the business to share the responsibility for total data quality.
We are ready for Total Data Quality, but are still waiting for the Total Data Quality Movement to get off the ground.
(If you do not have time to wait for TDQM to gain traction, we can help you measure the cost of poor quality data in your organization to win corporate buy-in now.)
That second question is a killer because most people — no matter if they’re in marketing, sales or manufacturing — rely on incomplete, inaccurate or just plain wrong information. Regardless of industry, we’ve been fixated on historic transactions because that’s what our systems are designed to provide us.
“Moneyball: The Art of Winning an Unfair Game” gives a great example of what I mean. The book (not the movie) describes Billy Beane hiring MBAs to map out the factors that would win a baseball game. They discovered something completely unexpected: that getting more batters on base would tire out pitchers. It didn’t matter if batters had multi-base hits, and it didn’t even matter if they walked. What mattered was forcing pitchers to throw ball after ball as they faced an unrelenting string of batters. Beane stopped looking at RBIs, ERAs and even home runs, and started hiring batters who consistently reached first base. To me, the book illustrates that the most useful knowledge isn’t always what we’ve been programmed to depend on or what is delivered to us via one app or another.
For years, people across industries have turned to ERP, CRM and web analytics systems to forecast sales and acquire new customers. By their nature, such systems are transactional, forcing us to rely on history as the best predictor of the future. Sure it might be helpful for retailers to identify last year’s biggest customers, but that doesn’t tell them whose blogs, posts or Tweets influenced additional sales. Isn’t it time for all businesses, regardless of industry, to adopt a different point of view — one that we at Informatica call “Data-First”? Instead of relying solely on transactions, a data-first POV shines a light on interactions. It’s like having a high knowledge IQ about relationships and connections that matter.
A data-first POV changes everything. With it, companies can unleash the killer app, the killer sales organization and the killer marketing campaign. Imagine, for example, if a sales person meeting a new customer knew that person’s concerns, interests and business connections ahead of time? Couldn’t that knowledge — gleaned from Tweets, blogs, LinkedIn connections, online posts and transactional data — provide a window into the problems the prospect wants to solve?
That’s the premise of two startups I know about, and it illustrates how a data-first POV can fuel innovation for developers and their customers. Today, we’re awash in data-fueled things that are somehow attached to the Internet. Our cars, phones, thermostats and even our wristbands are generating and gleaning data in new and exciting ways. That’s knowledge begging to be put to good use. The winners will be the ones who figure out that knowledge truly is power, and wield that power to their advantage.
A mid-sized insurer recently approached our team for help. They wanted to understand how they fell short in making their case to their executives. Specifically, they had proposed that fixing their customer data was key to supporting the executive team’s highly aggressive three-year growth plan (3x today’s revenue). Given this core organizational mission – aside from being a warm and fuzzy place to work supporting its local community – the slam-dunk solution here is simple. Just reducing the data migration effort around the next acquisition, or avoiding the ritual annual one-off data clean-up project, already pays for any tool set enhancing data acquisition, integration and hygiene. Will it get you to 3x today’s revenue? Probably not. What will help are the following:
Hard cost avoidance via software maintenance or consulting elimination is the easy part of the exercise. That is why CFOs love it and focus so much on it. It is easy to grasp and immediate (aka next quarter).
Soft cost reductions, such as staff redundancies, are a bit harder. Despite being viable, in my experience very few decision makers want to work on a business case to lay off staff; my team has seen only one so far. Most look at these savings as freed-up capacity, which can be re-deployed more productively. Productivity is also a bit harder to quantify, as you typically have to understand how data travels and gets worked on between departments.
Revenue effects are harder still, and esoteric to many people, because they include projections. They are often considered “soft” benefits, although they outweigh the other areas by two to three times in terms of impact. Ultimately, every organization runs its strategy based on projections (see the insurer in my first paragraph).
The hardest to quantify is risk. Not only is it based on projections – often from a third party (Moody’s, TransUnion, etc.) – but few people understand it. More often than not, clients will not even accept you investigating this area unless you have an advanced degree in insurance math. Nevertheless, risk can generate extra “soft” cost avoidance (beefing up a reserve account balance creates opportunity cost) but also revenue (realizing a risk premium previously ignored). Often risk profiles change due to relationships, which can be links to new “horizontal” information (transactional attributes) or “vertical” (hierarchical) information from the parent-child relationships of an entity and the parent’s or children’s transactions.
Given the above, my initial advice to the insurer would be to look at the heartache of their last acquisition, use a benchmark for IT productivity from improved data management capabilities (typically 20-26% – Yankee Group) and there you go. This is just the IT side so consider increasing the upper range by 1.4x (Harvard Business School) as every attribute change (last mobile view date) requires additional meetings on a manager, director and VP level. These people’s time gets increasingly more expensive. You could also use Aberdeen’s benchmark of 13hrs per average master data attribute fix instead.
You can also look at productivity areas, which are typically overly measured. Let’s assume a call center rep spends 20% of the average call time of 12 minutes (depending on the call type – account or bill inquiry, dispute, etc.) understanding
- Who the customer is
- What he bought online and in-store
- If he tried to resolve his issue on the website or store
- How he uses equipment
- What he cares about
- If he prefers call backs, SMS or email confirmations
- His response rate to offers
- His/her value to the company
If he spends that 20% of every call stringing together insights from five applications and twelve screens, instead of seeing the same information in a single frame within seconds in every application he touches, you have just freed up 20% of his hourly compensation.
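The call-center arithmetic above can be sketched as a simple back-of-the-envelope calculation. All inputs below (headcount, call volume, hourly cost, workdays) are illustrative assumptions, not figures from the text; only the 12-minute call and 20% lookup share come from the example.

```python
# Annual cost of the time reps spend stitching together customer context
# across applications; this is the value freed up by a consolidated view.

def call_center_savings(reps, calls_per_rep_per_day, avg_call_min,
                        lookup_share, hourly_cost, workdays_per_year):
    """Annualized cost of the 'lookup' share of every call."""
    wasted_min_per_call = avg_call_min * lookup_share
    wasted_hours = (reps * calls_per_rep_per_day * workdays_per_year
                    * wasted_min_per_call / 60.0)
    return wasted_hours * hourly_cost

# Assumed: 100 reps, 40 calls/day, 12-min calls, 20% spent on lookups,
# $25/hr fully loaded compensation, 220 workdays per year.
savings = call_center_savings(100, 40, 12, 0.20, 25.0, 220)
print(f"${savings:,.0f} per year")
```

Swap in your own headcount and fully loaded hourly cost; the structure of the calculation stays the same.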
Then look at the software, hardware, maintenance and ongoing management of the likely customer record sources (pick the worst and best quality one based on your current understanding), which will end up in a centrally governed instance. Per DAMA, every duplicate record will cost you between $0.45 (party) and $0.85 (product) per transaction (edit touch). At the very least each record will be touched once a year (likely 3-5 times), so multiply your duplicated record count by that and you have your savings from just de-duplication. You can also use Aberdeen’s benchmark of 71 serious errors per 1,000 records, meaning the chance of transactional failure and required effort (% of one or more FTE’s daily workday) to fix is high. If this does not work for you, run a data profile with one of the many tools out there.
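The de-duplication savings described above reduce to one multiplication. The per-touch cost range ($0.45 party, $0.85 product) and the 1–5 touches per year come from the text; the duplicate record count is a placeholder you would replace with the output of a data profiling run.

```python
# DAMA-style de-duplication savings: duplicates x cost per edit touch
# x touches per year. Duplicate count below is an assumption.

def dedup_savings(duplicate_records, cost_per_touch, touches_per_year):
    return duplicate_records * cost_per_touch * touches_per_year

# Assumed 50,000 duplicate records:
low = dedup_savings(50_000, 0.45, 1)   # party cost, touched once a year
high = dedup_savings(50_000, 0.85, 5)  # product cost, touched 5x a year
print(f"${low:,.0f} - ${high:,.0f} per year")
```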
If standardization of records (zip codes, billing codes, currency, etc.) is the problem, ask your business partner how many customer contacts (calls, mailings, emails, orders, invoices or account statements) fail outright and/or require validation because of these attributes. Once again, if you apply the productivity gains mentioned earlier, there are your savings. If you look at the number of orders whose payment or revenue recognition gets delayed by a week or a month, together with the average order amount, you can quantify how much profit (multiply by operating margin) you would be able to pull into the current financial year from the next one.
The same is true for speeding up the introduction of a new product, or a change to it, generating profits earlier. Note that the time value of funds realized earlier is too small to matter in most instances, especially in the current interest environment.
If emails bounce back or snail mail gets returned (no such address, no such name at this address, no such domain, no such user at this domain), (e)mail verification tools can help reduce the bounces. If every mail piece (forget email due to its minuscule cost) costs $1.25 – and this will vary by type of mailing (catalog, promotional post card, statement letter) – incorrect or incomplete records are wasted cost. If you can, use the fully loaded print cost incl. 3rd party data prep and returns handling. You will never capture all cost inputs, but take a conservative stab.
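The wasted-mail figure is again a one-line multiplication. The $1.25 per piece comes from the text; the mailing volume and undeliverable rate below are illustrative assumptions.

```python
# Direct-mail spend wasted on undeliverable pieces caused by bad records.

def mail_waste(pieces_mailed, bad_record_rate, cost_per_piece):
    return pieces_mailed * bad_record_rate * cost_per_piece

# Assumed: 500,000 catalog pieces, 5% undeliverable, $1.25 per piece.
print(f"${mail_waste(500_000, 0.05, 1.25):,.0f} wasted per mailing")
```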
If it was an offer, reduced bounces should also improve your response rate (also true for email now). Prospect mail response rates are typically around 1.2% (Direct Marketing Association), whereas phone response rates are around 8.2%. If you know that your current response rate is half that (for argument’s sake) and you send out 100,000 emails of which 1.3% (Silverpop) have customer data issues, then fixing 81-93% of them (our experience) will drop the bounce rate to under 0.3%, meaning more emails will arrive and be relevant. This in turn, multiplied by a standard conversion rate of 3% (MarketingSherpa; industry and channel specific) and the average order value (your data) multiplied by operating margin, gets you a benefit value for revenue.
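The email-campaign benefit chain above can be sketched as follows. The send volume (100,000), data-issue rate (1.3%), fix rate (81–93%) and 3% conversion rate come from the text; the average order value and operating margin are illustrative assumptions you would replace with your own data.

```python
# Revenue benefit of fixing email data issues: recovered deliverable
# emails x conversion rate x average order x operating margin.

def email_fix_benefit(sends, issue_rate, fix_rate,
                      conversion_rate, avg_order, operating_margin):
    recovered_emails = sends * issue_rate * fix_rate   # now deliverable
    extra_orders = recovered_emails * conversion_rate
    return extra_orders * avg_order * operating_margin

# Assumed: $150 average order, 12% operating margin.
low = email_fix_benefit(100_000, 0.013, 0.81, 0.03, 150.0, 0.12)
high = email_fix_benefit(100_000, 0.013, 0.93, 0.03, 150.0, 0.12)
print(f"${low:,.2f} - ${high:,.2f} benefit per campaign")
```

The point is less the absolute number than the structure: each benchmark rate is one factor, so you can defend every term of the estimate separately.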
If product data and inventory carrying cost or supplier spend are your issue, find out how many supplier shipments you receive every month and the average cost of a part (or cost range), then apply the Aberdeen master data failure rate (71 in 1,000) to use cases around missing or incorrect supersession or alternate-part data, to assess the value of a single shipment’s overspend. You can also just use the ending inventory amount from the 10-K report and apply a 3-10% improvement (Aberdeen) in a top-down approach. Alternatively, apply 3.2-4.9% to your annual supplier spend (KPMG).
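The top-down approach just described is a percentage applied to a balance-sheet figure. The improvement ranges (Aberdeen 3-10% of ending inventory, KPMG 3.2-4.9% of supplier spend) come from the text; the dollar bases below are placeholder assumptions you would replace with numbers from the 10-K.

```python
# Top-down inventory / supplier-spend improvement estimate.

def improvement_range(base_amount, low_pct, high_pct):
    """Return (low, high) estimated annual improvement."""
    return base_amount * low_pct, base_amount * high_pct

# Assumed: $80M ending inventory, $250M annual supplier spend.
inv_low, inv_high = improvement_range(80_000_000, 0.03, 0.10)
spend_low, spend_high = improvement_range(250_000_000, 0.032, 0.049)
print(f"Inventory: ${inv_low:,.0f} - ${inv_high:,.0f}")
print(f"Supplier spend: ${spend_low:,.0f} - ${spend_high:,.0f}")
```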
You could also investigate the expediting or return cost of shipments in a period due to incorrectly aggregated customer forecasts, wrong or incomplete product information or wrong shipment instructions in a product or location profile. Apply Aberdeen’s 5% improvement rate and there you go.
Consider that a North American utility told us that just fixing their 200 Tier1 suppliers’ product information achieved an increase in discounts from $14 to $120 million. They also found that fixing one basic out of sixty attributes in one part category saves them over $200,000 annually.
So what ROI percentages would you find tolerable or justifiable for, say, an EDW project, a CRM project, a new claims system, etc.? What would the annual savings or new revenue be that you were comfortable with? What was the craziest improvement you have seen come to fruition, which nobody expected?
Next time, I will add some more “use cases” to the list and look at some philosophical implications of averages.
- It’s difficult to find and retain resource skills to staff big data projects
- It takes too long to deploy Big Data projects from ‘proof-of-concept’ to production
- Big data technologies are evolving too quickly to adapt
- Big Data projects fail to deliver the expected value
- It’s difficult to make Big Data fit-for-purpose, assess trust, and ensure security
Informatica has extended its leadership in data integration and data quality to Hadoop with our Big Data Edition to address all of these Big Data challenges.
The biggest challenge companies face is finding and retaining Big Data resource skills to staff their Big Data projects. One large global bank started its first Big Data project with 5 Java developers, but as its Big Data initiative gained momentum it needed to hire 25 more Java developers that year. They quickly realized that while they had scaled their infrastructure to store and process massive volumes of data, they could not scale the necessary resource skills to implement their Big Data projects. The research mentioned earlier indicates that 80% of the work in a Big Data project relates to data integration and data quality. With Informatica you can staff Big Data projects with readily available Informatica developers instead of an army of developers hand-coding in Java and other Hadoop programming languages. In addition, we’ve proven to our customers that Informatica developers are up to 5 times more productive on Hadoop than hand-coding, and they don’t need to know how to program on Hadoop. A large Fortune 100 global manufacturer needed to hire 40 data scientists for its Big Data initiative. Do you really want these hard-to-find and expensive resources spending 80% of their time integrating and preparing data?
Another key challenge is that it takes too long to deploy Big Data projects to production. One of our Big Data Media and Entertainment customers told me, prior to purchasing the Informatica Big Data Edition, that most of his Big Data projects had failed. Naturally, I asked him why. His response was: “We have these hot-shot Java developers with a good idea which they prove out in our sandbox environment. But then when it comes time to deploy it to production they have to re-work a lot of code to make it perform and scale, make it highly available 24×7, have robust error-handling, and integrate with the rest of our production infrastructure. In addition, it is very difficult to maintain as things change. This results in project delays and cost overruns.” With Informatica, you can automate the entire data integration and data quality pipeline; everything you build in the development sandbox environment can be immediately and automatically deployed and scheduled for production as enterprise-ready. Performance, scalability, and reliability are handled simply through configuration parameters, without the re-building and re-working typical of hand-coding. And Informatica makes it easier to reuse existing work and maintain Big Data projects as things change. The Big Data Edition is built on Vibe, our virtual data machine, and provides near-universal connectivity so that you can quickly onboard new types of data of any volume and at any speed.
Big Data technologies are emerging and evolving extremely fast. This in turn becomes a barrier to innovation, since these technologies evolve much too quickly for most organizations to adopt before the next big thing comes along. What if you place the wrong technology bet and find that it is obsolete before you barely get started? Hadoop is gaining tremendous adoption, but it keeps evolving alongside other big data technologies; there are literally hundreds of open source projects and commercial vendors in the Big Data landscape. Informatica is built on the Vibe virtual data machine, which means that everything you built yesterday and build today can be deployed on the major big data technologies of tomorrow. Today it is five flavors of Hadoop, but tomorrow it could be Hadoop and other technology platforms. One of our Big Data Edition customers stated after purchasing the product that the Informatica Big Data Edition with Vibe is their “insurance policy to insulate our Big Data projects from changing technologies.” In fact, existing Informatica customers can take PowerCenter mappings they built years ago, import them into the Big Data Edition and run them on Hadoop, in many cases with minimal changes and effort.
Another complaint of the business is that Big Data projects fail to deliver the expected value. In a recent survey (1), 86% of marketers said they could generate more revenue if they had a more complete picture of customers. We all know that the cost of selling a product to an existing customer is only about 10 percent of the cost of selling the same product to a new customer. But it’s not easy to cross-sell and up-sell to existing customers. Customer Relationship Management (CRM) initiatives help to address these challenges, but they too often fail to deliver the expected business value. The impact is low marketing ROI, poor customer experience, customer churn, and missed sales opportunities. By using Informatica’s Big Data Edition with Master Data Management (MDM) to enrich customer master data with Big Data insights, you can create a single, complete view of customers that yields tremendous results. We call this real-time customer analytics, and Informatica’s solution improves the total customer experience by turning Big Data into actionable information so you can proactively engage with customers in real time. For example, this solution enables customer service to know which customers are likely to churn in the next two weeks so they can take the next best action, or, in the case of sales and marketing, determine next best offers based on customer online behavior to increase cross-sell and up-sell conversions.
Chief Data Officers and their analytics team find it difficult to make Big Data fit-for-purpose, assess trust, and ensure security. According to the business consulting firm Booz Allen Hamilton, “At some organizations, analysts may spend as much as 80 percent of their time preparing the data, leaving just 20 percent for conducting actual analysis” (2). This is not an efficient or effective way to use highly skilled and expensive data science and data management resource skills. They should be spending most of their time analyzing data and discovering valuable insights. The result of all this is project delays, cost overruns, and missed opportunities. The Informatica Intelligent Data platform supports a managed data lake as a single place to manage the supply and demand of data and converts raw big data into fit-for-purpose, trusted, and secure information. Think of this as a Big Data supply chain to collect, refine, govern, deliver, and manage your data assets so your analytics team can easily find, access, integrate and trust your data in a secure and automated fashion.
If you are embarking on a Big Data journey I encourage you to contact Informatica for a Big Data readiness assessment to ensure your success and avoid the pitfalls of the top 5 Big Data challenges.
- Gleanster survey of 100 senior-level marketers, “Lifecycle Engagement: Imperatives for Midsize and Large Companies,” sponsored by YesMail.
- “The Data Lake: Take Big Data Beyond the Cloud”, Booz Allen Hamilton, 2013
Recently, I had the opportunity to interview a half dozen CIOs and a half dozen CFOs. Like a marriage therapist, I got to hear each party’s story about the relationship. CFOs, in particular, felt that the quality of the relationship could impact their businesses’ success. Armed with this knowledge, I wanted to see if I could help each leader build a better working relationship. Previously, I let CIOs know about the emergence and significance of the strategic CFO. In today’s post, I will start by sharing the CIOs’ perspective on the CFO relationship, and then I will discuss how CFOs can build better CIO relationships.
CIOs feel under the gun these days!
If you don’t know, CIOs feel under the gun these days. CIOs see their enterprises demanding ubiquitous computing. Users want to use their own apps and expect corporate apps to look like their personal apps, such as Facebook. They want to bring their own preferred devices. Most of all, they want all their data on any device whenever they need it. This means CIOs are trying to manage a changing technical landscape of mobile, cloud, social, and big data, all vying for both dollars and attention. As a result, CIOs see their role in the midst of a sea change. Today, they need to focus less on building things and more on managing vendors. CIOs say that they need to 1) better connect what IT is doing to support the business strategy; 2) improve technical orchestration; and 3) improve process excellence. This is a big and growing charter.
CIOs see the CFO conversation being just about the numbers
CIOs worry that you don’t understand how many things are now being run by IT, and that historical percentages of revenue may no longer be appropriate. Think about healthcare, which used to be a complete laggard in technology but today is having everything digitized. Even a digital thermometer plugs into an iPad so it communicates directly with a patient record. The world has clearly changed. CIOs also worry that you view IT as merely a cost center, and that you do not see the value generated through IT investment or the asset that information provides to business decision makers. However, the good news is that I believe a different type of discussion is possible, and that CFOs have the opportunity to play an important role in shaping the value that CIOs deliver to the business.
CFOs should share their experience and business knowledge
CFOs that I talked to said that they believe the CFO/CIO relationship needs to be complementary and that the two roles have “the most concentric rings.” These CFOs believe that the stronger the relationship, the better it is for their business. One area where you can help the CIO is in sharing your knowledge of the business and business needs. CIOs are trying to get closer to the business, and you can help build this linkage and support the requests that come out of this process. Clearly, an aligned CFO can be “one of the biggest advocates of the CIO.” Given this, make sure that you are on your CIO’s investment committee.
Tell your CIO about your data pains
CFOs need to be good customers too. CFOs that I talked to told me that they know their business has “a data issue.” They worry about the integrity of data from the source. CFOs see their role as relying increasingly on timely, accurate data. They also know they have disparate systems and too much manual work going on in the back office. For them, integration needs to exist from the front end to the back end. Their teams personally feel the large number of manual steps.
For this reason, the CFOs we talked to believe that the integration of data is a big issue, whether they are in a small or a large business. Have you talked to your CIO about data integration or data quality projects to change the ugliness that you have to live with day in, day out? It will make you and the business more efficient. One CFO was blunt here, saying: “Making life easier is all about the systems. If the systems suck then you cannot trust the numbers when you get them. You want to access the numbers easily, timely, and accurately. You want to make it easier to forecast so you can set expectations with the business and externally.”
At the same time, CFOs that I talked to worried about the quality of financial and business data analysis. Even once they had the data, they worried about being able to analyze the information effectively. Increasingly, CFOs say that they need to help drive synergies across their businesses. At the same time, CFOs increasingly need to manage upward with information. They want information for decision makers so they can make better decisions.
Changing the CIO Dialog
So it is clear that CFOs like you see data – in particular financial data – as a competitive advantage. The question, as your unofficial therapist, is this: why aren’t you having a discussion with your CIO not just about the numbers or the financial justification for this or that system, but about the integration investment that can make your integration problems go away?
The strategic CFO is different than the “1975 Controller CFO”
Traditionally, CIOs have tended to work with what one CIO called a “1975 Controller CFO.” For this reason, the relationship between CIOs and CFOs was expressed well in a single word: “contentious.” But a new type of CFO is emerging that offers the potential of a different type of relationship. These so-called “strategic CFOs” can be an effective ally for CIOs. The question is: which type of CFO do you have? In this post, I will provide you with a bit of a litmus test so you can determine what type of CFO you have, but more importantly, I will share how you can take maximum advantage of a relationship with a strategically oriented CFO. But first, let’s hear a bit more of the CIOs’ reactions to CFOs.
Views of CIOs according to CIO interviews
Clearly, “the relationship…with these CFOs is filled with friction.” Controller CFOs “do not get why so many things require IT these days. They think that things must be out of whack.” One CIO said that Controller CFOs think technology should only cost 2-3% of revenue, while it can easily reach 8-9% of revenue these days. Another CIO complained that their discussions with a Controller CFO are only about IT productivity and effectiveness. In their eyes, this has limited the topics of discussion to IT cost reduction, IT-produced business savings, and the soundness of the current IT organization. Unfortunately, this CIO believes that Controller CFOs are not concerned with creating business value and do not see information as an asset. Instead, they view IT as a cost center. Another CIO says Controller CFOs are just about the numbers and see the CIO role as being about signing checks. It is a classic “demand versus supply” issue. At the same time, CIOs say that they see reporting to a Controller CFO as a narrowing function. As well, they believe it signals to the rest of the organization “that IT is not strategic and less important than other business functions.”
What then is this strategic CFO?
In contrast to their controller peers, strategic CFOs often have a broader business background than their accounting and CPA peers. Many have also pursued an MBA. Some have public accounting experience. Others come from professions like legal, business development, or investment banking.
More important than where they came from, strategic CFOs see a world that is about more than just numbers. They want to be more externally facing and to understand their company’s businesses. They tend to focus as much on what is going to happen as on what has happened. Remember, financial accounting is backward-facing. Given this, strategic CFOs spend a lot of time trying to understand what is going on in their firm’s businesses. One strategic CFO said that they do this so they can contribute and add value: “I want to be a true business leader.” And taking this posture often puts them among the top three decision makers for their business. There may be lessons in this posture for technology-focused CIOs.
Why is a strategic CFO such a game changer for CIO?
One CIO put it this way: “If you have a modern-day CFO, then they are an enabler of IT.” Strategic CFOs agree. Strategic CFOs see themselves as having “the most concentric circles with the CIO.” They believe that they need “CIOs more than ever to extract data to do their jobs better and to provide the management information business leadership needs to make better business decisions.” At the same time, the perspective of a strategic CFO can be valuable to the CIO because they have a good working knowledge of what the business wants. They also tend to be close to the management information systems and computer systems. CFOs typically understand the needs of the business better than most staff functions. The CFO, therefore, can be the biggest advocate of the CIO. This is why strategic CFOs should be on the CIO’s investment committee. Finally, a strategic CFO can help a CIO ensure their technology selections meet affordability targets and are compliant with the corporate strategy.
Are the priorities of a strategic CFO different?
Strategic CFOs still care about P&L, expense management, budgetary control, compliance, and risk management. But they are also concerned about performance management for the enterprise as a whole and about senior management reporting. As well, they want to do the above tasks faster so finance and other functions can do in-period management by exception. For this reason, they see data and data analysis as a big issue.
Strategic CFOs care about data integration
In interviews with strategic CFOs, I saw a group of people who truly understand the data holes in the current IT system, and who intuit firsthand the value proposition of investing to fix things here. These CFOs say they worry “about the integrity of data from the source and about being able to analyze information”. They want the integration to be good enough that at the push of a button they can get an accurate report. Otherwise, they have to “massage the data and then send it through another system to get what you need”.
These CFOs say they really feel the pain of systems not talking to each other. They understand this means making disparate systems, from the front end to the back end, talk to one another. But they also believe that making things less manual will have important consequences, including their own ability to inspect the books more frequently. Given this, they see data as a competitive advantage. One CFO even said that they thought data is the last competitive advantage.
Strategic CFOs are also worried about data security. They believe their auditors are going after this with a vengeance, and they are genuinely worried about getting hacked. One said, “Target scared a lot of folks and was in many respects a watershed event”. At the same time, strategic CFOs want to be able to drive synergies across the business. One CFO even extolled the value of a holistic view of the customer. When I asked why this was a finance objective rather than a marketing objective, they said that finance is responsible for business metrics, and “we have gaps in our business metrics around the customer, including the percentage of cross-sell taking place between our business units”. Another CFO amplified this theme, saying that “increasingly we need to manage upward with information. For this reason, we need information for decision makers so they can make better decisions”. Another strategic CFO summed it up: “the integration of the right systems to provide the right information needs to be done so we and the business have the right information to manage and make decisions at the right time”.
So what are you waiting for?
If you are lucky enough to have a strategic CFO, start building your relationship, and you can start by discussing their data integration and data quality problems. So I have a question for you: how many of you think you have a controller CFO versus a strategic CFO? Please share here.
In my last blog, I talked about the dreadful experience of cleaning raw data by hand during my days as an analyst a few years back. Well, the truth is, I was not alone. At a recent data mining Meetup event in the San Francisco Bay Area, I asked a few analysts: “How much time do you spend cleaning your data at work?” “More than 80% of my time,” “most of my days,” the analysts said, adding that it was “not fun”.
But check this out: there are over a dozen Meetup groups focused on data science and data mining here in the Bay Area where I live. Those groups put on events multiple times a month, with topics often centered on hot, emerging technologies such as machine learning, graph analysis, real-time analytics, new algorithms for analyzing social media data, and of course, anything Big Data. Cool BI tools, new programming models and algorithms for better analysis are a big draw for data practitioners these days.
That got me thinking: if what the analysts told me is true, i.e., they spend 80% of their time prepping data and only the remaining 20% analyzing it and visualizing the results (which, BTW, “is actually fun”, to quote one of them), then why are they drawn to events focused on tools that can only help them 20% of the time? Why wouldn’t they want to explore technologies that can help address the dreadful 80%, the data scrubbing they complain about?
Having been there myself, I thought perhaps a little self-reflection would help answer the question.
As a student of math, I love data and am fascinated by the good stories I can discover in it. My two-year math program in graduate school was primarily focused on learning how to build fabulous math models to simulate real events, and on using those formulas to predict the future or look for meaningful patterns.
I used BI and statistical analysis tools while at school, and continued to use them at work after I graduated. Those tools were great in that they helped me get to the results and see what was in my data, so I could develop conclusions and make recommendations based on those insights for my clients. Without BI and visualization tools, I would not have delivered any results.
That was the fun and glamorous part of my job as an analyst. But when I was not creating nice charts and presentations to tell the stories in my data, I was spending time, a great amount of time, sometimes up to the wee hours, cleaning and verifying my data. I was convinced that was part of my job and I just had to suck it up.
It was only a few months ago that I stumbled upon data quality software – it happened when I joined Informatica. At first I thought they were talking to the wrong person when they started pitching me data quality solutions.
Turns out, the concept of data quality automation is a highly relevant and extremely intuitive subject for me, and for anyone who deals with data on a regular basis. Data quality software offers an automated process for data cleansing that is much faster and delivers more accurate results than a manual process. To put that in math context: if a data quality tool can reduce the data cleansing effort from 80% to 40% (BTW, this is hardly a random number; some of our customers have reported much better results), analysts can now free up 40% of their time from scrubbing data and use it to do the things they like: playing with data in BI tools, building new models, running more scenarios, producing different views of the data, and discovering things they could not before, all with clean, trusted data. No more bored-to-death experiences. What they are left with is improved productivity, more accurate and consistent results, compelling stories about data, and, most importantly, the freedom to focus on the things they like! Not too shabby, right?
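As a rough illustration of what that automation looks like, here is a minimal cleansing sketch in pandas. The column names, sample records and rules are all invented for the example; a real data quality tool expresses such rules in its own design environment, but the idea is the same: codify the fixes once, then apply them to every batch.

```python
import pandas as pd

# Hypothetical raw records an analyst might otherwise fix by hand.
raw = pd.DataFrame({
    "customer": ["  Acme Corp ", "ACME CORP", "Beta LLC", None],
    "revenue":  ["1,200", "1200", "-50", "300"],
})

def cleanse(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    # Standardize names: trim whitespace and casing so duplicates line up.
    out["customer"] = out["customer"].str.strip().str.title()
    # Normalize numbers: strip thousands separators, coerce bad values to NaN.
    out["revenue"] = pd.to_numeric(
        out["revenue"].str.replace(",", "", regex=False), errors="coerce"
    )
    # Apply a simple validity rule: revenue must be non-negative.
    out.loc[out["revenue"] < 0, "revenue"] = None
    # Drop records with no customer at all.
    return out.dropna(subset=["customer"]).reset_index(drop=True)

clean = cleanse(raw)
```

Run once, the function is a curiosity; run against every incoming file, it is exactly the 40% of an analyst's week that no longer has to happen by hand.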
I am excited about trying out the data quality tools we have here at Informatica. My fellow analysts, you should start looking into them too. I will check back in soon with more stories to share.
About 15 or so years ago, some friends of mine called me to share great news. Their dating relationship had become serious and they were headed toward marriage. After a romantic proposal and a beautiful ring, it was time to plan the wedding and invite the guests.
This exciting time was confounded by a significant challenge. Though they were very much in love, one of them had an incredibly tough time making wise financial choices. During the wedding planning process, the financially astute fiancée grew concerned about the problems the challenged partner could bring. Even though the financially illiterate fiancée had every other admirable quality, the finance issue nearly created enough doubt to end the engagement. Fortunately, my friends moved forward with the ceremony, were married and immediately went to work on learning new healthy financial habits as a couple.
Let’s segue into how this relates to telecommunications and data, specifically to your average communications operator. Just like a concerned fiancée, you’d think twice about making a commitment to an organization that didn’t have a strong foundation.
Like the financially challenged fiancée, the average operator has a number of excellent qualities: a functioning business model, great branding, international roaming, creative ads, long-term prospects, smart people at the helm and all the data and IT assets you can imagine. Unfortunately, despite the externally visible bells and whistles, over time they tend to lose operational soundness around the basics. Specifically, their lack of data quality causes them to forfeit an ever-increasing amount of billing revenue. Their poor data costs them millions each year.
A recent set of engagements highlighted this phenomenon. A small carrier (3-6 million subscribers) that implements a more consistent, unique way to manage core subscriber profile and product data could recover underbilling of $6.9 million annually. A larger carrier (10-20 million subscribers) could recover $28.1 million every year by fixing billing errors. (This doesn’t even cover the large Indian and Chinese carriers with over 100 million customers!)
Typically, a billing error starts with an incorrect setup of a service line item’s base price and its 30+ related discount line variances. Next, the wrong service discount item is applied at contract start. If that does not happen (or on top of those), it will occur when the customer calls in during, or right before the end of, the first contract period (12-24 months) to complain about service quality, bill shock, etc. Here, the call center rep will break an existing triple-play bundle by deleting an item and setting up a separate non-bundle service line item at a lower price (higher discount). The head of billing actually told us, “our reps just give a residential subscriber a discount of $2 for calling us”. It’s even higher for commercial clients.
To make matters worse, this change will trigger misaligned (incorrect) activation dates or even bill duplication, all of which will have to be fixed later by multiple staff on the BSS and OSS side, or may even trigger an investigation project by the revenue assurance department. Worst case, deleting the item from the bundle (especially for B2B clients) will not terminate the wholesale cost the carrier still owes a national carrier for a broadband line, which is often one third of the retail price for a business customer.
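To make the scale concrete, here is a back-of-the-envelope sketch of how that seemingly harmless $2 courtesy credit compounds across a subscriber base. Every figure below is an assumption chosen for illustration, not data reported by any carrier:

```python
# Back-of-the-envelope leakage model; all inputs are illustrative assumptions.
subscribers     = 4_500_000  # a mid-size carrier, per the ranges above
credited_share  = 0.60       # assumed share of subscribers credited in a year
credit_per_call = 2.00       # the "$2 for calling us" residential credit

# If the credit is applied once per complaining subscriber:
one_off_leakage = subscribers * credited_share * credit_per_call

# If a rep instead books it as a recurring monthly bill adjustment,
# the same behavior leaks twelve times as much:
recurring_leakage = one_off_leakage * 12
```

Even under these invented but conservative inputs, a one-off $2 credit sums to millions per year, roughly the order of magnitude of the engagement figures above, before counting misaligned activation dates, duplicate bills or unterminated wholesale costs.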
To come full circle to my initial “accounting challenged” example; would you marry (invest in) this organization? Do you think this can or should be solved in a big bang approach or incrementally? Where would you start: product management, the service center, residential or commercial customers?
Observations and illustrations contained in this post are estimates only and are based entirely upon information provided by the prospective customer and on our observations and benchmarks. While we believe our recommendations and estimates to be sound, the degree of success achieved by the prospective customer is dependent upon a variety of factors, many of which are not under Informatica’s control and nothing in this post shall be relied upon as representative of the degree of success that may, in fact, be realized and no warranty or representation of success, either express or implied, is made.
Did I really compare data quality to flushing toilet paper? Yeah, I think I did. Makes me laugh when I read that, but still true. And yes, I am still playing with more data. This time it’s a location schedule for earthquake risk. I see a 26-story structure with a building value of only $136,000 built in who knows what year. I’d pull my hair out if it weren’t already shaved off.
So let’s talk about the six steps toward data quality competency in underwriting. These six steps are standard in the enterprise, but what we will discuss is how to tackle them in insurance underwriting, and more importantly, the business impact of effectively adopting the competency. It’s a repeating, self-reinforcing cycle, and when done correctly it can be intelligent and adaptive to changing business needs.
Profile – Effectively profile and discover data from multiple sources
We’ll start at the beginning, a very good place to start. First you need to understand your data: where is it from, and in what shape does it arrive? Whether the sources are internal or external, the profile step will help identify the problem areas. In underwriting, this will involve a lot of external submission data from brokers and MGAs, combined with internal and service bureau data to get a full picture of the risk. Identify your key data points for underwriting and a desired state for that data. Once the data is profiled, you’ll get a very good sense of where your troubles are. Then continually profile as you bring other sources online, using the same standards of measurement. As an aside, this will also help in remediating brokers that are not meeting the standard.
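As an illustration, a first-pass profile of a hypothetical broker submission extract might look like this in pandas (the column names and sample values are invented for the sketch; a commercial profiling tool does far more, but the idea is the same):

```python
import pandas as pd

# Hypothetical submission extract from a broker.
submissions = pd.DataFrame({
    "building_value": [136_000, 2_500_000, None, 480_000],
    "stories":        [26, 3, 4, 30],
    "construction":   ["wood frame", "steel", None, "wood frame"],
    "year_built":     [None, 1998, 2005, 0],
})

# A first-pass profile: completeness, cardinality and basic ranges per field.
profile = pd.DataFrame({
    "non_null_pct": submissions.notna().mean() * 100,
    "distinct":     submissions.nunique(),
    "min":          submissions.min(numeric_only=True),
    "max":          submissions.max(numeric_only=True),
})
```

Even this crude profile immediately surfaces the kinds of anomalies I groused about earlier: a `year_built` minimum of 0 and a 26-story building paired with a $136,000 value.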
Measure – Establish data quality metrics and targets
As an underwriter, you will need to determine the quality bar for the data you use. Usually this means flagging your most critical data fields for meeting underwriting guidelines. See where you are and where you want to be, and determine how you will measure the quality of the data against that desired state. And by the way, actuarial and risk will likely do the same thing on the same or similar data. Over time, it all comes together as a team.
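A minimal sketch of such a scorecard, with fields, sample records and targets invented for illustration, might look like this:

```python
import pandas as pd

# Illustrative quality targets for critical underwriting fields.
targets = {"building_value": 99.0, "construction": 98.0, "year_built": 95.0}

records = pd.DataFrame({
    "building_value": [136_000, None, 480_000, 2_500_000],
    "construction":   ["wood frame", "steel", None, "steel"],
    "year_built":     [1987, 1998, 2005, None],
})

# Measure completeness per critical field and compare against the target.
completeness = records.notna().mean() * 100
scorecard = pd.DataFrame({
    "completeness_pct": completeness,
    "target_pct":       pd.Series(targets),
})
scorecard["meets_target"] = scorecard["completeness_pct"] >= scorecard["target_pct"]
```

Completeness is only one dimension; validity, consistency and timeliness metrics slot into the same scorecard shape, which is also what makes it shareable with actuarial and risk.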
Design – Quickly build comprehensive data quality rules
This is the meaty part of the cycle, and fun to boot. First, look to your desired future state and your critical underwriting fields. For each one, determine the rules by which you normally fix errant data. For example, what do you do when you see a 30-story wood-frame structure? How do you validate, cleanse and remediate that discrepancy? This may involve fuzzy logic or supporting data lookups, and can easily be captured. Do this, write it down, and catalog it to be codified in your data quality tool. As you go along, you will see a growing library of data quality rules being compiled for broad use.
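For instance, the wood-frame check above could be codified as a simple rule function. The threshold and field names are assumptions for illustration; a real data quality tool would express this in its own rule language, but the catalog-of-rules idea carries over directly:

```python
# Hypothetical rule codifying the "30-story wood frame" fix described above.
MAX_WOOD_FRAME_STORIES = 6  # assumed plausibility threshold, for illustration

def check_construction_height(construction, stories):
    """Return a list of rule violations for one risk record."""
    violations = []
    if construction.strip().lower() == "wood frame" and stories > MAX_WOOD_FRAME_STORIES:
        violations.append(
            f"wood frame structure reported at {stories} stories "
            f"(max plausible: {MAX_WOOD_FRAME_STORIES})"
        )
    return violations

# Rules accumulate into a catalog the whole organization can reuse.
RULE_CATALOG = {"construction_vs_height": check_construction_height}
```

Each rule you write down and catalog this way is underwriting knowledge captured once and applied everywhere.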
Deploy – Native data quality services across the enterprise
Once these rules are compiled and tested, they can be deployed for reuse across the organization. This is the beautiful, magical thing that happens: your institutional knowledge of your underwriting criteria is captured and reused, not just once but to cleanse existing data, new data and everything going forward. Your analysts will love you; your actuaries and risk modelers will love you; you will be a hero.
Review – Assess performance against goals
Remember those quality goals you set when you started? Check and see how you’re doing. After a few weeks and months, you should be able to profile the data, run the reports and see that the needle has moved. Remember that as part of the self-reinforcing cycle, you can now identify new issues to tackle and adjust rules that aren’t working. Metrics you’ll want to watch over time include higher quote flow, better productivity and more competitive premium pricing.
Monitor – Proactively address critical issues
Now monitor constantly. As you bring new MGAs online, receive new underwriting guidelines or launch into new lines of business, you will repeat this cycle. You will also apply the same rule set as portfolios are acquired; it becomes a good way to sanity-check the acquired business against your quality standards.
In case it wasn’t apparent, your data quality plan is now largely automated. With few manual exceptions, you should not have to remediate data the way you did in the past. In each of these steps there is obvious business value. In the end, it all adds up to better risk/cat modeling, more accurate risk pricing, cleaner data (for everyone in the organization) and more time doing the core business of underwriting. Imagine if you could increase your quote volume simply by not needing to muck around in data. Imagine if you could improve your quote-to-bind ratio through better quality data and pricing. The last time I checked, that’s just good insurance business.
And now for something completely different…cats on pianos. No, just kidding. But check here to learn more about Informatica’s insurance initiatives.
The growth of big data drives many things, including the use of cloud-based resources, the growth of non-traditional databases and, of course, the growth of data integration. What’s typically not as well understood are the required patterns of data integration, and the ongoing need for better and more innovative data cleansing tools.
Indeed, while writing Big Data@Work: Dispelling the Myths, Uncovering the Opportunities, Tom Davenport observed data scientists at work. During his talk at VentureBeat’s DataBeat conference, Davenport said data scientists would need better data integration and data cleansing tools before they’d be able to keep up with the demand within organizations.
But Davenport is not alone. Most who deploy big data systems see the need for data integration and data cleansing tools. In most instances, not having those tools in place hindered progress.
I would agree with Davenport, in that the number one impediment to moving to any type of big data is how to clean and move data. Addressing that aspect of big data is Job One for enterprise IT.
The fact is, just implementing Hadoop-based databases won’t make a big data system work. The data must come from existing operational data stores, leveraging all types of interfaces and database models. The fundamental need to translate data structure and content in order to move from one data store (or, typically, several) to the big data system involves more complexity than most enterprises understand.
The path forward may require more steps than originally anticipated, and perhaps the whole big data thing was sold as something much easier than it actually is. My role for the last few years has been to be the guy who lets enterprises know that data integration and data cleansing are core components of building and deploying big data systems. You may as well learn to deal with them early in the process.
The good news is that data integration is not a new concept, and the technology is more than mature. What’s more, data cleansing tools can now be a part of the data integration technology offerings, and actually clean the data as it moves from place to place, and do so in near real-time.
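As a minimal sketch of that cleanse-in-transit idea, here rules run record by record while data streams from source to target, rather than as a separate batch pass afterwards. The records, field names and validity rule are invented for illustration:

```python
def source():
    """Stand-in for an operational data store feeding the pipeline."""
    yield {"id": "001", "email": " ALICE@EXAMPLE.COM "}
    yield {"id": "002", "email": ""}
    yield {"id": "003", "email": "bob@example.com"}

def cleanse(records):
    """Normalize fields and drop invalid records while the data is in flight."""
    for rec in records:
        email = rec["email"].strip().lower()
        if "@" in email:  # crude validity rule, for illustration only
            yield {**rec, "email": email}

# The target store receives already-clean records; no separate batch step.
loaded = list(cleanse(source()))
```

Because the cleansing sits inside the movement itself, the target never holds the dirty intermediate state, which is what makes the near-real-time claim plausible.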
So, doing big data anytime soon? Now is the time to define your big data strategy, in terms of the new technology you’ll be dragging into the enterprise. It’s also time to expand or change the use of data integration and perhaps the enabling technology that is built or designed around the use of big data.
I hate to sound like a broken record, but somebody has to say this stuff.