Tag Archives: Data Quality
- It’s difficult to find and retain resource skills to staff big data projects
- It takes too long to deploy Big Data projects from ‘proof-of-concept’ to production
- Big data technologies are evolving too quickly to adapt
- Big Data projects fail to deliver the expected value
- It’s difficult to make Big Data fit-for-purpose, assess trust, and ensure security
Informatica has extended its leadership in data integration and data quality to Hadoop with our Big Data Edition to address all of these Big Data challenges.
The biggest challenge companies face is finding and retaining Big Data resource skills to staff their Big Data projects. One large global bank started their first Big Data project with 5 Java developers, but as their Big Data initiative gained momentum they needed to hire 25 more Java developers that year. They quickly realized that while they had scaled their infrastructure to store and process massive volumes of data, they could not scale the necessary resource skills to implement their Big Data projects. The research mentioned earlier indicates that 80% of the work in a Big Data project relates to data integration and data quality. With Informatica you can staff Big Data projects with readily available Informatica developers instead of an army of developers hand-coding in Java and other Hadoop programming languages. In addition, we’ve proven to our customers that Informatica developers are up to 5 times more productive on Hadoop than hand-coders, and they don’t need to know how to program on Hadoop. A large Fortune 100 global manufacturer needed to hire 40 data scientists for their Big Data initiative. Do you really want these hard-to-find and expensive resources spending 80% of their time integrating and preparing data?
Another key challenge is that it takes too long to deploy Big Data projects to production. One of our Big Data Media and Entertainment customers told me, prior to purchasing the Informatica Big Data Edition, that most of his Big Data projects had failed. Naturally, I asked him why they had failed. His response was, “We have these hot-shot Java developers with a good idea which they prove out in our sandbox environment. But then when it comes time to deploy it to production they have to re-work a lot of code to make it perform and scale, make it highly available 24×7, have robust error-handling, and integrate with the rest of our production infrastructure. In addition, it is very difficult to maintain as things change. This results in project delays and cost overruns.” With Informatica, you can automate the entire data integration and data quality pipeline; everything you build in the development sandbox environment can be immediately and automatically deployed and scheduled for production as enterprise ready. Performance, scalability, and reliability are simply handled through configuration parameters, without the re-building or re-working of development that is typical with hand-coding. And Informatica makes it easier to reuse existing work and maintain Big Data projects as things change. The Big Data Edition is built on Vibe, our virtual data machine, and provides near-universal connectivity so that you can quickly onboard new types of data of any volume and at any speed.
Big Data technologies are emerging and evolving extremely fast. This in turn becomes a barrier to innovation, since these technologies evolve much too quickly for most organizations to adopt before the next big thing comes along. What if you place the wrong technology bet and find that it is obsolete before you have barely started? Hadoop is gaining tremendous adoption, but it has evolved along with other big data technologies, and there are literally hundreds of open source projects and commercial vendors in the Big Data landscape. Informatica is built on the Vibe virtual data machine, which means that everything you built yesterday and build today can be deployed on the major big data technologies of tomorrow. Today it is five flavors of Hadoop, but tomorrow it could be Hadoop and other technology platforms. One of our Big Data Edition customers stated after purchasing the product that “Informatica Big Data Edition with Vibe is our insurance policy to insulate our Big Data projects from changing technologies.” In fact, existing Informatica customers can take PowerCenter mappings they built years ago, import them into the Big Data Edition, and run them on Hadoop, in many cases with minimal changes and effort.
Another complaint of business is that Big Data projects fail to deliver the expected value. In a recent survey (1), 86% of marketers say they could generate more revenue if they had a more complete picture of customers. We all know that the cost of selling a product to an existing customer is only about 10 percent of the cost of selling the same product to a new customer. But it’s not easy to cross-sell and up-sell to existing customers. Customer Relationship Management (CRM) initiatives help to address these challenges, but they too often fail to deliver the expected business value. The impact is low marketing ROI, poor customer experience, customer churn, and missed sales opportunities. By using Informatica’s Big Data Edition with Master Data Management (MDM) to enrich customer master data with Big Data insights, you can create a single, complete view of customers that yields tremendous results. We call this real-time customer analytics, and Informatica’s solution improves total customer experience by turning Big Data into actionable information so you can proactively engage with customers in real time. For example, this solution enables customer service to know which customers are likely to churn in the next two weeks so they can take the next best action, or, in the case of sales and marketing, to determine next best offers based on customer online behavior to increase cross-sell and up-sell conversions.
Chief Data Officers and their analytics team find it difficult to make Big Data fit-for-purpose, assess trust, and ensure security. According to the business consulting firm Booz Allen Hamilton, “At some organizations, analysts may spend as much as 80 percent of their time preparing the data, leaving just 20 percent for conducting actual analysis” (2). This is not an efficient or effective way to use highly skilled and expensive data science and data management resource skills. They should be spending most of their time analyzing data and discovering valuable insights. The result of all this is project delays, cost overruns, and missed opportunities. The Informatica Intelligent Data platform supports a managed data lake as a single place to manage the supply and demand of data and converts raw big data into fit-for-purpose, trusted, and secure information. Think of this as a Big Data supply chain to collect, refine, govern, deliver, and manage your data assets so your analytics team can easily find, access, integrate and trust your data in a secure and automated fashion.
If you are embarking on a Big Data journey I encourage you to contact Informatica for a Big Data readiness assessment to ensure your success and avoid the pitfalls of the top 5 Big Data challenges.
- (1) Gleanster survey of 100 senior-level marketers, “Lifecycle Engagement: Imperatives for Midsize and Large Companies,” sponsored by YesMail.
- (2) “The Data Lake: Take Big Data Beyond the Cloud,” Booz Allen Hamilton, 2013.
Recently, I had the opportunity to interview a half dozen CIOs and a half dozen CFOs. Kind of like a marriage therapist, I got to hear each party’s story about the relationship. CFOs, in particular, felt that the quality of the relationship could impact their businesses’ success. Armed with this knowledge, I wanted to see if I could help each leader build a better working relationship. Previously, I let CIOs know about the emergence and significance of the strategic CFO. In today’s post, I will start by sharing the CIOs’ perspective on the CFO relationship and then I will discuss how CFOs can build better CIO relationships.
CIOs feel under the gun these days!
If you don’t know, CIOs feel under the gun these days. CIOs see their enterprises demanding ubiquitous computing. Users want to use their apps and expect corporate apps to look like their personal apps such as Facebook. They want to bring their own preferred devices. Most of all, they want all their data on any device when they need it. This means CIOs are trying to manage a changing technical landscape of mobile, cloud, social, and big data. These are all vying for both dollars and attention. As a result, CIOs see their role in a sea change. Today, they need to focus less on building things and more on managing vendors. CIOs say that they need to 1) better connect what IT is doing to support the business strategy; 2) improve technical orchestration; and 3) improve process excellence. This is a big and growing charter.
CIOs see the CFO conversation being just about the numbers
CIOs worry that you don’t understand how many things are now being run by IT and that historical percentages of revenue may no longer be appropriate. Think about healthcare, which used to be a complete laggard in technology but today is having everything digitized. Even a digital thermometer plugs into an iPad so it communicates directly with a patient record. The world has clearly changed. And CIOs worry that you view IT as merely a cost center and that you do not see the value generated through IT investment or the asset that information provides to business decision makers. However, the good news is that I believe a different type of discussion is possible, and that CFOs have the opportunity to play an important role in helping to shape the value that CIOs deliver to the business.
CFOs should share their experience and business knowledge
CFOs that I talked to said that they believe the CFO/CIO relationship needs to be complementary and that the roles have the most concentric rings. These CFOs believe that the stronger the relationship, the better it is for their business. One area where you can help the CIO is in sharing your knowledge of the business and business needs. CIOs are trying to get closer to the business, and you can help build this linkage and support requests that come out of this process. Clearly, an aligned CFO can be “one of the biggest advocates of the CIO”. Given this, make sure that you are on your CIO’s Investment Committee.
Tell your CIO about your data pains
CFOs need to be good customers too. CFOs that I talked to told me that they know their business has “a data issue”. They worry about the integrity of data from the source. CFOs see their role as relying increasingly on timely, accurate data. They also know they have disparate systems and too much manual stuff going on in the back office. For them, integration needs to exist from the frontend to the backend. Their teams personally feel the large number of manual steps.
For this reason, the CFOs we talked to believe that the integration of data is a big issue whether they are in a small or large business. Have you talked to your CIO about data integration or quality projects to change the ugliness that you have to live with day in, day out? It will make you and the business more efficient. One CFO was blunt here, saying “making life easier is all about the systems. If the systems suck then you cannot trust the numbers when you get them. You want to access the numbers easily, timely, and accurately. You want to make [it] easier to forecast so you can set expectations with the business and externally”.
At the same time, CFOs that I talked to worried about the quality of financial and business data analysis. Once they had data, they worried about being able to analyze the information effectively. Increasingly, CFOs say that they need to help drive synergies across their businesses. At the same time, CFOs increasingly need to manage upward with information. They want information for decision makers so they can make better decisions.
Changing the CIO Dialog
So it is clear that CFOs like you see data, and financial data in particular, as a competitive advantage. The question, as your unofficial therapist, is this: why aren’t you having a discussion with your CIO that goes beyond the numbers or the financial justification for this or that system, and instead asks about the integration investment that can make your integration problems go away?
The strategic CFO is different than the “1975 Controller CFO”
Traditionally, CIOs have tended to work with what one CIO called a “1975 Controller CFO”. For this reason, the relationship between CIOs and CFOs was expressed well in a single word: “contentious”. But a new type of CFO is emerging that offers the potential of a different type of relationship. These so-called “strategic CFOs” can be an effective ally for CIOs. The question is, which type of CFO do you have? In this post, I will provide you with a bit of a litmus test so you can determine what type of CFO you have but, more importantly, I will share how you can take maximum advantage of having a strategic-oriented CFO relationship. But first let’s hear a bit more of the CIOs’ reactions to CFOs.
Views of CIOs according to CIO interviews
Clearly, “the relationship…with these CFOs is filled with friction”. Controller CFOs “do not get why so many things require IT these days. They think that things must be out of whack.” One CIO said that they think technology should only cost 2-3% of revenue, while it can easily reach 8-9% of revenue these days. Another CIO complained that their discussion with a Controller CFO is only about IT productivity and effectiveness. In their eyes, this has limited the topics of discussion to IT cost reduction, IT-produced business savings, and the soundness of the current IT organization. Unfortunately, this CIO believes that Controller CFOs are not concerned with creating business value and do not see information as an asset. Instead, they view IT as a cost center. Another CIO says Controller CFOs are just about the numbers and see the CIO role as being about signing checks. It is a classic “demand versus supply” issue. At the same time, CIOs say that they see reporting to a Controller CFO as a narrowing function. As well, they believe it signals to the rest of the organization “that IT is not strategic and less important than other business functions”.
What then is this strategic CFO?
In contrast to their controller peers, strategic CFOs often have a broader business background than their accounting and CPA peers. Many have also pursued an MBA. Some have public accounting experience. Others come from professions like legal, business development, or investment banking.
More important than where they came from, strategic CFOs see a world that is about more than just numbers. They want to be more externally facing and to understand their company’s businesses. They tend to focus as much on what is going to happen as they do on what has happened. Remember, financial accounting is backward facing. Given this, strategic CFOs spend a lot of time trying to understand what is going on in their firm’s businesses. One strategic CFO said that they do this so they can contribute and add value: “I want to be a true business leader.” And taking this posture often puts them among the top three decision makers for their business. There may be lessons in this posture for technology-focused CIOs.
Why is a strategic CFO such a game changer for CIO?
One CIO put it this way: “If you have a modern day CFO, then they are an enabler of IT”. Strategic CFOs agree. They see themselves as having “the most concentric circles with the CIO”. They believe that they need “CIOs more than ever to extract data to do their jobs better and to provide the management information business leadership needs to make better business decisions”. At the same time, the perspective of a strategic CFO can be valuable to the CIO because they have good working knowledge of what the business wants. They also tend to be close to the management information systems and computer systems. CFOs typically understand the needs of the business better than most staff functions. The CFO, therefore, can be the biggest advocate of the CIO. This is why strategic CFOs should be on the CIO’s Investment Committee. Finally, a strategic CFO can help a CIO ensure their technology selections meet affordability targets and are compliant with the corporate strategy.
Are the priorities of a strategic CFO different?
Strategic CFOs still care about P&L, Expense Management, Budgetary Control, Compliance, and Risk Management. But they are also concerned about performance management for the enterprise as a whole and senior management reporting. As well, they want to do the above tasks faster so finance and other functions can do in-period management by exception. For this reason they see data and data analysis as a big issue.
Strategic CFOs care about data integration
In interviews of strategic CFOs, I saw a group of people who truly understand the data holes in the current IT system. And they understand firsthand the value proposition of investing to fix things here. These CFOs say that they worry “about the integrity of data from the source and about being able to analyze information”. They say that they want the integration to be good enough that at the push of a button they can get an accurate report. Otherwise, they have to “massage the data and then send it through another system to get what you need”.
These CFOs say that they really feel the pain of systems not talking to each other. They understand this means making disparate systems from the frontend to the backend talk to one another. But they also believe that making things less manual will drive important consequences, including their own ability to inspect books more frequently. Given this, they see data as a competitive advantage. One CFO even said that they thought data is the last competitive advantage.
Strategic CFOs are also worried about data security. They believe their auditors are going after this with a vengeance. They are really worried about getting hacked. One said, “Target scared a lot of folks and was in many respects a watershed event”. At the same time, strategic CFOs want to be able to drive synergies across the business. One CFO even extolled the value of a holistic view of the customer. When I asked why this was a finance objective rather than a marketing objective, they said that finance is responsible for business metrics and that they have gaps in their business metrics around the customer, including the percentage of cross-sell taking place between their business units. Another CFO amplified this theme by saying that “increasingly we need to manage upward with information. For this reason, we need information for decision makers so they can make better decisions”. Another strategic CFO summed this up by saying “the integration of the right systems to provide the right information needs to be done so we and the business have the right information to manage and make decisions at the right time”.
So what are you waiting for?
If you are lucky enough to have a Strategic CFO, start building your relationship. And you can start by discussing their data integration and data quality problems. So I have a question for you. How many of you think you have a Controller CFO versus a Strategic CFO? Please share here.
In my last blog, I talked about the dreadful experience of cleaning raw data by hand as a former analyst a few years back. Well, the truth is, I was not alone. At a recent data mining Meetup event in the San Francisco Bay Area, I asked a few analysts: “How much time do you spend on cleaning your data at work?” “More than 80% of my time” and “most of my days”, said the analysts, and “they are not fun”.
But check this out: there are over a dozen Meetup groups focused on data science and data mining here in the Bay Area, where I live. Those groups put on events multiple times a month, with topics often around hot, emerging technologies such as machine learning, graph analysis, real-time analytics, new algorithms for analyzing social media data, and of course, anything Big Data. Cool BI tools, new programming models, and algorithms for better analysis are a big draw to data practitioners these days.
That got me thinking… if what analysts said to me is true, i.e., they spent 80% of their time on data prepping and 1/4 of that time analyzing the data and visualizing the results, which BTW, “is actually fun”, quoting a data analyst, then why are they drawn to the events focused on discussing the tools that can only help them 20% of the time? Why wouldn’t they want to explore technologies that can help address the dreadful 80% of the data scrubbing task they complain about?
Having been there myself, I thought perhaps a little self-reflection would help answer the question.
As a student of math, I love data and am fascinated by the good stories I can discover in it. My two-year math program in graduate school was primarily focused on learning how to build fabulous math models to simulate real events, and on using those formulas to predict the future or look for meaningful patterns.
I used BI and statistical analysis tools while at school, and continued to use them at work after I graduated. Those tools were great in that they helped me get to the results and see what was in my data, so I could develop conclusions and make recommendations for my clients based on those insights. Without BI and visualization tools, I would not have delivered any results.
That was the fun and glamorous part of my job as an analyst. But when I was not creating nice charts and presentations to tell the stories in my data, I was spending time, a great amount of time, sometimes into the wee hours, cleaning and verifying my data. I was convinced that was part of my job and I just had to suck it up.
It was only a few months ago that I stumbled upon data quality software – it happened when I joined Informatica. At first I thought they were talking to the wrong person when they started pitching me data quality solutions.
Turns out, the concept of data quality automation is a highly relevant and extremely intuitive subject to me, and to anyone who deals with data on a regular basis. Data quality software offers an automated process for data cleansing that is much faster and delivers more accurate results than a manual process. To put that in math context, if a data quality tool can reduce the data cleansing effort from 80% to 40% (btw, this is hardly a random number; some of our customers have reported much better results), that means analysts can free up 40% of their time from scrubbing data and use that time to do the things they like: playing with data in BI tools, building new models or running more scenarios, producing different views of the data and discovering things they may not have been able to before, and doing all of that with clean, trusted data. No more bored-to-death experience. What they are left with is improved productivity, more accurate and consistent results, compelling stories about data, and, most important, the ability to focus on doing the things they like! Not too shabby, right?
I am excited about trying out the data quality tools we have here at Informatica. My fellow analysts, you should start looking into them also. And I will check back in soon with more stories to share.
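The arithmetic behind that claim can be sketched in a few lines of Python. This is only an illustration: the function name `analysis_pct` is made up for this example, and it assumes an analyst's time is split between just two activities, cleansing and analysis.

```python
def analysis_pct(cleansing_pct: int) -> int:
    """Percent of working time left for analysis.

    Assumption for illustration: time is split only between
    data cleansing and analysis, so the two always sum to 100.
    """
    if not 0 <= cleansing_pct <= 100:
        raise ValueError("cleansing_pct must be between 0 and 100")
    return 100 - cleansing_pct


before = analysis_pct(80)  # manual cleansing takes 80% of the time
after = analysis_pct(40)   # cleansing effort reduced to 40% of the time
freed = after - before     # percentage points moved from cleansing to analysis

print(f"Analysis time before: {before}%")
print(f"Analysis time after:  {after}%")
print(f"Freed up: {freed} percentage points, {after / before:.1f}x the analysis time")
```

Under that two-activity assumption, halving the cleansing effort triples the time available for analysis, from 20% to 60%.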
About 15 or so years ago, some friends of mine called me to share great news. Their dating relationship had become serious and they were headed toward marriage. After a romantic proposal and a beautiful ring, it was time to plan the wedding and invite the guests.
This exciting time was confounded by a significant challenge. Though they were very much in love, one of them had an incredibly tough time making wise financial choices. During the wedding planning process, the financially astute fiancée grew concerned about the problems the challenged partner could bring. Even though the financially illiterate fiancée had every other admirable quality, the finance issue nearly created enough doubt to end the engagement. Fortunately, my friends moved forward with the ceremony, were married and immediately went to work on learning new healthy financial habits as a couple.
Let’s segue into how this relates to telecommunications and data, specifically to your average communications operator. Just like a concerned fiancée, you’d think twice about making a commitment to an organization that didn’t have a strong foundation.
Like the financially challenged fiancée, the average operator has a number of excellent qualities: functioning business model, great branding, international roaming, creative ads, long-term prospects, smart people at the helm and all the data and IT assets you can imagine. Unfortunately, despite the externally visible bells and whistles, over time they tend to lose operational soundness around the basics. Specifically, their lack of data quality causes them to forfeit an ever increasing amount of billing revenue. Their poor data costs them millions each year.
A recent set of engagements highlighted this phenomenon. The small carrier (3-6 million subscribers) who implements a more consistent, unique way to manage core subscriber profile and product data could recover underbilling of $6.9 million annually. A larger carrier (10-20 million subscribers) could recover $28.1 million every year from fixing billing errors. (This doesn’t even cover the large Indian and Chinese carriers who have over 100 million customers!)
Typically, a billing error starts with an incorrect set up of a service line item base price and related 30+ discount line variances. Next, the wrong service discount item is applied at contract start. If that did not happen (or on top of those), it will occur when the customer calls in during or right before the end of the first contract period (12-24 months) to complain about the service quality, bill shock, etc. Here, the call center rep will break an existing triple play bundle by deleting an item and setting up a separate non-bundle service line item at a lower price (higher discount). The head of billing actually told us, “our reps just give a residential subscriber a discount of $2 for calling us”. It’s even higher for commercial clients.
To make matters worse, this change will trigger misaligned (incorrect) activation dates or even bill duplication, all of which will have to be fixed later by multiple staff on the BSS and OSS side or may even trigger an investigation project by the revenue assurance department. Worst case, the deletion of the item from the bundle (especially for B2B clients) will not terminate the wholesale cost the carrier still owes a national carrier for a broadband line, which often is 1/3 of the retail price for a business customer.
To come full circle to my initial “accounting challenged” example; would you marry (invest in) this organization? Do you think this can or should be solved in a big bang approach or incrementally? Where would you start: product management, the service center, residential or commercial customers?
Observations and illustrations contained in this post are estimates only and are based entirely upon information provided by the prospective customer and on our observations and benchmarks. While we believe our recommendations and estimates to be sound, the degree of success achieved by the prospective customer is dependent upon a variety of factors, many of which are not under Informatica’s control and nothing in this post shall be relied upon as representative of the degree of success that may, in fact, be realized and no warranty or representation of success, either express or implied, is made.
Did I really compare data quality to flushing toilet paper? Yeah, I think I did. Makes me laugh when I read that, but still true. And yes, I am still playing with more data. This time it’s a location schedule for earthquake risk. I see a 26-story structure with a building value of only $136,000 built in who knows what year. I’d pull my hair out if it weren’t already shaved off.
So let’s talk about the six steps for data quality competency in underwriting. These six steps are standard in the enterprise. But, what we will discuss is how to tackle these in insurance underwriting. And more importantly, what is the business impact to effective adoption of the competency. It’s a repeating self-reinforcing cycle. And when done correctly can be intelligent and adaptive to changing business needs.
Profile – Effectively profile and discover data from multiple sources
We’ll start at the beginning, a very good place to start. First you need to understand your data. Where is it from and in what shape does it come? Whether the sources are internal or external, the profile step will help identify the problem areas. In underwriting, this will involve a lot of external submission data from brokers and MGAs. This is then combined with internal and service bureau data to get a full picture of the risk. Identify your key data points for underwriting and a desired state for that data. Once the data is profiled, you’ll get a very good sense of where your troubles are. And continually profile as you bring other sources online, using the same standards of measurement. As an aside, this will also help in remediating brokers that are not meeting the standard.
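To make the profile step a little more concrete, here is a minimal Python sketch that measures how often each key field is populated, per source. Everything in it is a hypothetical stand-in: the sample rows, the source names, and the field names are invented for illustration, and a real data quality tool would profile far more than completeness.

```python
from collections import Counter

# Hypothetical submission rows; sources and field names are illustrative only.
submissions = [
    {"source": "broker_a", "stories": 26, "building_value": 136000,
     "year_built": None, "construction": "steel"},
    {"source": "broker_a", "stories": 2, "building_value": 450000,
     "year_built": 1998, "construction": "wood frame"},
    {"source": "mga_b", "stories": None, "building_value": 1200000,
     "year_built": 2005, "construction": ""},
]

# Assumed set of key underwriting data points.
KEY_FIELDS = ["stories", "building_value", "year_built", "construction"]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def profile(rows, fields):
    """Return {source: {field: fraction of rows where the field is populated}}."""
    totals = Counter()
    filled = {}
    for row in rows:
        src = row["source"]
        totals[src] += 1
        per_field = filled.setdefault(src, Counter())
        for field in fields:
            if not is_missing(row.get(field)):
                per_field[field] += 1
    return {
        src: {field: filled[src][field] / totals[src] for field in fields}
        for src in totals
    }


report = profile(submissions, KEY_FIELDS)
for src, stats in report.items():
    print(src, stats)
```

Running the same `profile` function on every new source keeps the standard of measurement consistent, which is what makes broker-to-broker comparisons possible.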
Measure – Establish data quality metrics and targets
As an underwriter you will need to determine what is the quality bar for the data you use. Usually this means flagging your most critical data fields for meeting underwriting guidelines. See where you are and where you want to be. Determine how you will measure the quality of the data as well as desired state. And by the way, actuarial and risk will likely do the same thing on the same or similar data. Over time it all comes together as a team.
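One simple way to express "where you are and where you want to be" is a table of targets compared against measured values. The sketch below assumes completeness percentages as the metric; the field names, the target values, and the measured values are all invented for illustration.

```python
# Hypothetical targets: minimum percent of records with a usable value
# for each critical underwriting field.
TARGETS = {"stories": 98, "building_value": 99, "year_built": 95}

# Hypothetical measured completeness, e.g. taken from a profiling run.
measured = {"stories": 99, "building_value": 97, "year_built": 80}


def gaps(measured_pct, target_pct):
    """Return {field: shortfall in percentage points}, largest shortfall first.

    Fields that meet or exceed their target are left out. A field with no
    measurement is treated as 0% complete.
    """
    short = {
        field: target - measured_pct.get(field, 0)
        for field, target in target_pct.items()
        if measured_pct.get(field, 0) < target
    }
    return dict(sorted(short.items(), key=lambda item: item[1], reverse=True))


shortfalls = gaps(measured, TARGETS)
for field, points in shortfalls.items():
    print(f"{field}: {points} points below target")
```

Sorting by shortfall gives a rough priority order for which fields to write rules for first.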
Design – Quickly build comprehensive data quality rules
This is the meaty part of the cycle, and fun to boot. First look to your desired future state and your critical underwriting fields. For each one, determine the rules by which you normally fix errant data. Like what you do when you see a 30-story wood frame structure? How do you validate, cleanse and remediate that discrepancy? This may involve fuzzy logic or supporting data lookups, and can easily be captured. Do this, write it down, and catalog it to be codified in your data quality tool. As you go along you will see a growing library of data quality rules being compiled for broad use.
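As a rough illustration of what a small rule library might look like once it is written down, here is a Python sketch. It is not how any particular data quality tool represents rules; the `Rule` class, the rule names, and especially the thresholds are assumptions made up for this example and are not underwriting guidance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named check that returns True when a record violates it."""
    name: str
    violated: Callable[[dict], bool]


# Illustrative thresholds only; real values would come from underwriting guidelines.
MAX_WOOD_FRAME_STORIES = 6
MIN_VALUE_PER_STORY = 50_000

RULES = [
    Rule(
        "wood_frame_too_tall",
        lambda r: r.get("construction") == "wood frame"
        and (r.get("stories") or 0) > MAX_WOOD_FRAME_STORIES,
    ),
    Rule(
        "value_too_low_for_height",
        lambda r: bool(r.get("stories"))
        and r.get("building_value") is not None
        and r["building_value"] / r["stories"] < MIN_VALUE_PER_STORY,
    ),
    Rule("missing_year_built", lambda r: r.get("year_built") is None),
]


def check(record):
    """Return the names of all rules the record violates, in rule order."""
    return [rule.name for rule in RULES if rule.violated(record)]


# The 26-story, $136,000 building with no build year mentioned earlier.
suspicious = {"stories": 26, "building_value": 136000,
              "year_built": None, "construction": "steel"}
print(check(suspicious))
```

Keeping each rule as a small named object is one way to let the library grow one rule at a time while every rule stays individually testable and reusable.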
Deploy – Native data quality services across the enterprise
Once these rules are compiled and tested, they can be deployed for reuse in the organization. This is the beautiful magical thing that happens. Your institutional knowledge of your underwriting criteria can be captured and reused. This doesn’t mean just once, but reused to cleanse existing data, new data and everything going forward. Your analysts will love you, your actuaries and risk modelers will love you; you will be a hero.
Review – Assess performance against goals
Remember those goals you set for your quality when you started? Check and see how you’re doing. After a few weeks and months, you should be able to profile the data, run the reports and see that the needle will have moved. Remember that as part of the self-reinforcing cycle, you can now identify new issues to tackle and adjust those that aren’t working. One metric that you’ll want to measure over time is the increase of higher quote flow, better productivity and more competitive premium pricing.
Monitor – Proactively address critical issues
Now monitor constantly. As you bring new MGAs online, receive new underwriting guidelines or launch into new lines of business you will repeat this cycle. You will also utilize the same rule set as portfolios are acquired. It becomes a good way to sanity check the acquisition of business against your quality standards.
In case it wasn’t apparent, your data quality plan is now more automated. With few manual exceptions, you should not have to remediate data the way you did in the past. In each of these steps there is obvious business value. In the end, it all adds up to better risk/cat modeling, more accurate risk pricing, cleaner data (for everyone in the organization) and more time doing the core business of underwriting. Imagine if you could increase your quote volume simply by not needing to muck around in data. Imagine if you could improve your quote to bind ratio through better quality data and pricing. The last time I checked, that’s just good insurance business.
And now for something completely different…cats on pianos. No, just kidding. But check here to learn more about Informatica’s insurance initiatives.
The growth of big data drives many things, including the use of cloud-based resources, the growth of non-traditional databases, and, of course, the growth of data integration. What’s typically not as well understood are the required patterns of data integration and the ongoing need for better and more innovative data cleansing tools.
Indeed, while writing Big Data@Work: Dispelling the Myths, Uncovering the Opportunities, Tom Davenport observed data scientists at work. During his talk at VentureBeat’s DataBeat conference, Davenport said data scientists would need better data integration and data cleansing tools before they’d be able to keep up with the demand within organizations.
But Davenport is not alone. Most who deploy big data systems see the need for data integration and data cleansing tools. In most instances, not having those tools in place hindered progress.
I would agree with Davenport, in that the number one impediment to moving to any type of big data is how to clean and move data. Addressing that aspect of big data is Job One for enterprise IT.
The fact is, just implementing Hadoop-based databases won’t make a big data system work. Indeed, the data must come from existing operational data stores, and leverage all types of interfaces and database models. The fundamental need to translate the data structure and content to effectively move from one data store (or stores, typically) to the big data systems has more complexities than most enterprises understand.
The path forward may require more steps than originally anticipated, and perhaps the whole big data thing was sold as something that’s much easier than it actually is. My role for the last few years has been to be the guy who lets enterprises know that data integration and data cleansing are core components of the process of building and deploying big data systems. You may as well learn to deal with it early in the process.
The good news is that data integration is not a new concept, and the technology is more than mature. What’s more, data cleansing tools can now be a part of the data integration technology offerings, and actually clean the data as it moves from place to place, and do so in near real-time.
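As a rough illustration of cleaning data while it moves, the following Python sketch cleanses each record as it streams from a source to a target. It is a toy under stated assumptions, not a picture of any vendor's product: the `cleanse` and `move` functions and the `state` field are hypothetical names chosen for the example.

```python
from typing import Dict, Iterable, Iterator


def cleanse(record: Dict[str, str]) -> Dict[str, str]:
    """Trim whitespace and upper-case a hypothetical 'state' field."""
    cleaned = {key: value.strip() for key, value in record.items()}
    if "state" in cleaned:
        cleaned["state"] = cleaned["state"].upper()
    return cleaned


def move(source: Iterable[Dict[str, str]]) -> Iterator[Dict[str, str]]:
    """Yield cleansed records one at a time as they flow from source to target."""
    for record in source:
        yield cleanse(record)


# Assumed sample data standing in for an operational data store.
source_rows = [{"name": " Acme Corp ", "state": "fl"}]
target_rows = list(move(source_rows))
```

Because `move` is a generator, each record is cleaned as it passes through rather than in a separate batch step after loading.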
So, doing big data anytime soon? Now is the time to define your big data strategy, in terms of the new technology you’ll be dragging into the enterprise. It’s also time to expand or change the use of data integration and perhaps the enabling technology that is built or designed around the use of big data.
I hate to sound like a broken record, but somebody has to say this stuff.
I was just looking at some data I found. Yes, real data, not fake demo stuff. Real hurricane location analysis with modeled loss numbers. At first glance, I thought it looked good. There are addresses, latitudes/longitudes, values, loss numbers and other goodies like year built and construction codes. Yes, just the sort of data that an underwriter would look at when writing a risk. But after skimming through the schedule of locations a few things start jumping out at me. So I dig deeper. I see a multi-million dollar structure in Palm Beach, Florida with $0 in modeled loss. That’s strange. And wait, some of these geocode resolutions look a little coarse. Are they tier one or tier two counties? Who would know? At least all of the construction and occupancy codes have values, albeit they look like defaults. Perhaps it’s time to talk about data quality.
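The oddities above are the kind of thing a simple profiling pass can surface. Here is a minimal Python sketch, assuming made-up field names (`insured_value`, `modeled_loss`, `geocode_resolution`, `construction_code`), an assumed $1 million threshold and assumed default code values; none of these come from the actual data set.

```python
def profile_location(row, default_codes=("0", "UNKNOWN")):
    """Return a list of data quality concerns for one location record."""
    issues = []
    # Assumed threshold: a high-value structure should not model to zero loss.
    if row["insured_value"] >= 1_000_000 and row["modeled_loss"] == 0:
        issues.append("high value with zero modeled loss")
    # Assumed resolution labels: anything coarser than street level is flagged.
    if row["geocode_resolution"] not in ("address", "street"):
        issues.append("coarse geocode resolution")
    # Assumed default values: a populated code may still be a placeholder.
    if row["construction_code"] in default_codes:
        issues.append("construction code looks like a default")
    return issues


# Hypothetical record resembling the Palm Beach example.
sample = {
    "insured_value": 4_500_000,
    "modeled_loss": 0,
    "geocode_resolution": "postal_code",
    "construction_code": "UNKNOWN",
}
sample_issues = profile_location(sample)
```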
This whole concept of data quality is a tricky one. As the cost of acquiring good data is weighed against speed of underwriting/quoting and model correctness, I’m sure some tradeoffs are made. But the impact can be huge. First, incomplete data will either force defaults in risk models and pricing or add mathematical uncertainty. Second, massively incomplete data chews up personnel resources to cleanse and enhance. And third, if not corrected, the risk profile will be wrong with potential impact to pricing and portfolio shape. And that’s just to name a few.
I’ll admit it’s daunting to think about. Imagine tens of thousands of submissions a month. Schedules of thousands of locations received every day. Can there even be a way out of this cave? The answer is yes, and that answer is a robust enterprise data quality infrastructure. But wait, you say, enterprise data quality is an IT problem. Yeah, I guess, just like trying to flush an entire roll of toilet paper in one go is the plumber’s problem. Data quality in underwriting is a business problem, a business opportunity and has real business impacts.
Join me in Part 2 as I outline the six steps for data quality competency in underwriting with tangible business benefits and enterprise impact. And now that I have you on the edge of your seats, get smart about the basics of enterprise data quality.
According to a recent article in the LA Times, healthcare costs in the United States far exceed costs in other countries. For example, heart bypass surgery costs an average of $75,345 in the U.S. compared to $15,742 in the Netherlands and $16,492 in Argentina. In the U.S., healthcare accounts for 18% of GDP, and that share is increasing.
Michelle Blackmer is a healthcare industry expert at Informatica. In this interview, she explains why business as usual isn’t good enough anymore. Healthcare organizations are rethinking how they do business in an effort to improve outcomes, reduce costs, and comply with regulatory pressures such as the Affordable Care Act (ACA). Michelle believes a data-driven healthcare culture is foundational to personalized medicine and discusses the importance of clean, safe and connected data in executing a successful transformation.
Q. How is the healthcare industry responding to the rising costs of healthcare?
In response to the rising costs of healthcare, regulatory pressures such as the Affordable Care Act (ACA), and the need to improve patient outcomes at lower costs, the U.S. healthcare industry is transforming from a volume-based to a value-based model. In this new model, healthcare organizations need to invest in delivering personalized medicine.
To appreciate the potential of personalized medicine, think about your own healthcare experience. It’s typically reactive. You get sick, you go to the doctor, the doctor issues a prescription and you wait a couple of days to see if that drug works. If it doesn’t, you call the doctor and she tries another drug. This process is tedious, painful and costly.
Now imagine if you had a chronic disease like depression or cancer. On average, any given prescription drug only works for half of those who take it. Among cancer patients, the rate of ineffectiveness jumps to 75 percent. Anti-depressants are effective in only 62 percent of those who take them.
Organizations like MD Anderson and UPMC aim to put an end to cancer. They are combining scientific research with access to clean, safe and connected data (data of all types including genomic data). The insights revealed will empower personalized chemotherapies. Personalized medicine offers customized treatments based on patient history and best practices. Personalized medicine will transform healthcare delivery. Click on the links to watch videos about their transformational work.
Q. What role does data play in enabling personalized medicine?
Data is foundational to value-based care and personalized medicine. Not just any data will do. It needs to be clean, safe and connected data. It needs to be delivered rapidly across hallways and across networks.
As an industry, healthcare is at a stage where meaningful electronic data is being generated. Now you need to ensure that the data is accessible and trustworthy so that it can be rapidly analyzed. As data is aggregated across the ecosystem and married with financial and genomic data, data quality issues become more obvious. It’s vital that you can define the data issues so that people can spend their time analyzing the data to gain insights instead of wading through and manually resolving data quality issues.
The ability to trust data will differentiate leaders from followers. Leaders will advance personalized medicine because they rely on clean, safe and connected data to:
1) Practice analytics as a core competency
2) Define evidence, deliver best practice care and personalize medicine
3) Engage patients and collaborate to foster strong, actionable relationships
Take a look at this Healthcare eBook for more on this topic: Potential Unlocked: Transforming Healthcare by Putting Information to Work.
Q. What is holding healthcare organizations back from managing their healthcare data like other mission-critical assets?
When you say other mission-critical assets, I think of facilities, equipment, etc. Each of these assets has people and money assigned to manage and maintain them. The healthcare organizations I talk to who are highly invested in personalized medicine recognize that data is mission-critical. They are investing in the people, processes and technology needed to ensure data is clean, safe and connected. The technology includes data integration, data quality and master data management (MDM).
What’s holding other healthcare organizations back is that while they realize they need data governance, they wrongly believe they need to hire big teams of “data stewards” to be successful. In reality, you don’t need to hire a big team. Use the people you already have doing data governance. You may not have made this a formal part of their job description and they might not have data governance technologies yet, but they do have the skillset and they are already doing the work of a data steward.
So while a technology investment is required and you need people who can use the technology, start by formalizing the data stewardship work people are doing already as part of their current job. This way you have people who understand the data, taking an active role in the management of the data and they even get excited about it because their work is being recognized. IT takes on the role of enabling these people instead of having responsibility for all things data.
Q. Can you share examples of how immature information governance is a serious impediment to healthcare payers and providers?
Sure, without information governance, data is not harmonized across sources and so it is hard to make sense of it. This isn’t a problem when you are one business unit or one department, but when you want to get a comprehensive view or a view that incorporates external sources of information, this approach falls apart.
For example, let’s say the cardiology department in a healthcare organization implements a dashboard. The dashboard looks impressive. Then a group of physicians sees the dashboard, points out errors and asks where the information (e.g. diagnosis or attending physician) came from. If you can’t answer these questions or trace the data back to its sources, or if you have data inconsistencies, the dashboard loses credibility. This is an example of how analytics fail to gain adoption and fail to foster innovation.
Q. Can you share examples of what data-driven healthcare organizations are doing differently?
Certainly, while many are just getting started on their journey to becoming data-driven, I’m seeing some inspiring examples, including:
- Implementing data governance for healthcare analytics. The program and data are owned by the business, enabled by IT and supported by technology such as data integration, data quality and MDM.
- Connecting information from across the entire healthcare ecosystem including 3rd party sources like payers, state agencies, and reference data like credit information from Equifax, firmographics from Dun & Bradstreet or NPI numbers from the national provider registry.
- Establishing consistent data definitions and parameters
- Thinking about the internet of things (IoT) and how to incorporate device data into analysis
- Engaging patients through non-traditional channels including loyalty programs and social media; tracking this information in a customer relationship management (CRM) system
- Fostering collaboration by understanding the relationships between patients, providers and the rest of the ecosystem
- Analyzing data to understand what is working and what is not working so that they can drive out unwanted variations in care
Q. What advice can you give healthcare provider and payer employees who want access to high quality healthcare data?
As with other organizational assets that deliver value—like buildings and equipment—data requires a foundational investment in people and systems to maximize return. In other words, institutions and individuals must start managing their mission-critical data with the same rigor they manage other mission-critical enterprise assets.
Q. Anything else you want to add?
Yes, I wanted to thank our 14 visionary customer executives at data-driven healthcare organizations such as MD Anderson, UPMC, Quest Diagnostics, Sutter Health, St. Joseph Health, Dallas Children’s Medical Center and Navinet for taking time out of their busy schedules to share their journeys toward becoming data-driven at Informatica World 2014. In our next post, I’ll share some highlights about how they are using data, how they are ensuring it is clean, safe and connected, and a few data management best practices. Informatica World attendees will be able to download presentations starting today! If you missed Informatica World 2014, stay tuned for our upcoming webinars featuring many of these examples.