Category Archives: Data Quality
“Inaccurate, inconsistent and disconnected supplier information prohibits us from doing accurate supplier spend analysis, leveraging discounts, comparing and choosing the best prices, and enforcing corporate standards.”
This is a quotation from a manufacturing company executive. It illustrates the negative impact that poorly managed supplier information can have on a company’s ability to cut costs and achieve revenue targets.
Many supply chain and procurement teams at large companies struggle to see the total relationship they have with suppliers across product lines, business units and regions. Why? Supplier information is scattered across dozens or hundreds of Enterprise Resource Planning (ERP) and Accounts Payable (AP) applications. Too much valuable time is spent manually reconciling inaccurate, inconsistent and disconnected supplier information in an effort to see the big picture. All this manual effort results in back office administrative costs that are higher than they should be.
Do these quotations from supply chain leaders and their teams sound familiar?
“We have 500,000 suppliers. 15-20% of our supplier records are duplicates. 5% are inaccurate.”
“I get 100 e-mails a day questioning which supplier to use.”
“To consolidate vendor reporting for a single supplier between divisions is really just a guess.”
“Every year 1099 tax mailings get returned to us because of invalid addresses, and we pay a lot of Schedule B fines to the IRS.”
“Two years ago we spent a significant amount of time and money cleansing supplier data. Now we are back where we started.”
Please join me and Naveen Sharma, Director of the Master Data Management (MDM) Practice at Cognizant for a Webinar, Supercharge Your Supply Chain Applications with Better Supplier Information, on Tuesday, July 29th at 10 am PT.
During the Webinar, we’ll explain how better managing supplier information can help you achieve the following goals:
- Accelerate supplier onboarding
- Mitigate the risk of supply disruption
- Better manage supplier performance
- Streamline billing and payment processes
- Improve supplier relationship management and collaboration
- Make it easier to evaluate non-compliance with Service Level Agreements (SLAs)
- Decrease costs by negotiating favorable payment terms and SLAs
I hope you can join us for this upcoming Webinar!
“Not only do we underestimate the cost for projects up to 150%, but we overestimate the revenue it will generate.” This quotation from an Energy & Petroleum (E&P) company executive illustrates the negative impact of inaccurate, inconsistent and disconnected well data and asset data on revenue potential.
“Operational Excellence” is a common goal of many E&P company executives pursuing higher growth targets. But, inaccurate, inconsistent and disconnected well data and asset data may be holding them back. It obscures the complete picture of the well information lifecycle, making it difficult to maximize production efficiency, reduce Non-Productive Time (NPT), streamline the oilfield supply chain, calculate well by-well profitability, and mitigate risk.
To explain how E&P companies can better manage well data and asset data, we hosted a webinar, “Attention E&P Executives: Streamlining the Well Information Lifecycle.” Our well data experts Stephanie Wilkin, Senior Principal Consultant at Noah Consulting, and Stephan Zoder, Director of Value Engineering at Informatica shared some advice. E&P companies should reevaluate “throwing more bodies at a data cleanup project twice a year.” This approach does not support the pursuit of operational excellence.
In this interview, Stephanie shares details about the award-winning collaboration between Noah Consulting and Devon Energy to create a single trusted source of well data, which is standardized and mastered.
Q. Congratulations on winning the 2014 Innovation Award, Stephanie!
A. Thanks Jakki. It was really exciting working with Devon Energy. Together we put the technology and processes in place to manage and master well data in a central location and share it with downstream systems on an ongoing basis. We were proud to win the 2014 Innovation Award for Best Enterprise Data Platform.
Q. What was the business need for mastering well data?
A. As E&P companies grow so do their needs for business-critical well data. All departments need clean, consistent and connected well data to fuel their applications. We implemented a master data management (MDM) solution for well data with the goals of improving information management, business productivity, organizational efficiency, and reporting.
Q. How long did it take to implement the MDM solution for well data?
A. The Devon Energy project kicked off in May of 2012. Within five months we built the complete solution from gathering business requirements to development and testing.
Q. What were the steps in implementing the MDM solution?
A: The first and most important step was securing buy-in on a common definition for master well data or Unique Well Identifier (UWI). The key was to create a definition that would meet the needs of various business functions. Then we built the well master, which would be consistent across various systems, such as G&G, Drilling, Production, Finance, etc. We used the Professional Petroleum Data Management Association (PPDM) data model and created more than 70 unique attributes for the well, including Lahee Class, Fluid Direction, Trajectory, Role and Business Interest.
As part of the original go-live, we had three source systems of well data and two target systems connected to the MDM solution. Over the course of the next year, we added three additional source systems and four additional target systems. We did a cross-system analysis to make sure every department has the right wells and the right data about those wells. Now the company uses MDM as the single trusted source of well data, which is standardized and mastered, to do analysis and build reports.
Q. What’s been the traditional approach for managing well data?
A. Typically when a new well is created, employees spend time entering well data into their own systems. For example, one person enters well data into the G&G application. Another person enters the same well data into the Drilling application. A third person enters the same well data into the Finance application. According to statistics, it takes about 30 minutes to enter a single well into a particular financial application.
So imagine if you need to add 500 new wells to your systems. This is common after a merger or acquisition. That translates to roughly 250 hours, or 6.25 weeks, of employee time that can be saved on the well create process! By automating across systems, you not only save time, you eliminate redundant data entry and the errors that come with it.
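The arithmetic behind that estimate is easy to check. Here is a quick back-of-the-envelope sketch; the 30-minute-per-well figure is the one quoted above, and the 40-hour work week is an assumption:

```python
# Back-of-the-envelope estimate of manual well-entry effort.
# Assumptions: 30 minutes per well in one application (figure quoted above),
# 500 new wells (e.g., after an acquisition), 40-hour work weeks.
wells = 500
minutes_per_well = 30

hours = wells * minutes_per_well / 60  # total hours of manual entry
weeks = hours / 40                     # expressed in 40-hour work weeks

print(hours, weeks)  # 250.0 6.25
```

Note that this counts a single application; with the same well keyed into G&G, Drilling, and Finance separately, the redundant effort multiplies accordingly.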
Q. That sounds like a painfully slow and error-prone process.
A. It is! But that’s only half the problem. Without a single trusted source of well data, how do you get a complete picture of your wells? When you compare the well data in the G&G system to the well data in the Drilling or Finance systems, it’s typically inconsistent and difficult to reconcile. This leads to the question, “Which one of these systems has the best version of the truth?” Employees spend too much time manually reconciling well data for reporting and decision-making.
Q. So there is a lot to be gained by better managing well data.
A. That’s right. The CFO typically loves the ROI on a master well data project. It’s a huge opportunity to save time and money, boost productivity and get more accurate reporting.
Q: What were some of the business requirements for the MDM solution?
A: We couldn’t build a solution that was narrowly focused on meeting the company’s needs today. We had to keep the future in mind. Our goal was to build a framework that was scalable and supportable as the company’s business environment changed. This allows the company to add additional data domains or attributes to the well data model at any time.
Q: Why did you choose Informatica MDM?
A: The decision to use Informatica MDM for the MDM Trust Framework came down to the following capabilities:
- Match and Merge: With Informatica, we get a lot of flexibility. Some systems carry the API or well government ID, but some don’t. We can match and merge records differently based on the system.
- X-References: We keep a cross-reference between all the systems. We can go back to the master well data and find out where that data came from and when. We can see where changes have occurred because Informatica MDM tracks the history and lineage.
- Scalability: This was a key requirement. While we went live after only 5 months, we’ve been continually building out the well master based on the requirements of the target systems.
- Flexibility: Down the road, if we want to add an additional facet or classification to the well master, the framework allows for that.
- Simple Integration: Instead of building point-to-point integrations, we use the hub model.
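To make the match-and-merge idea above concrete, here is a minimal, hypothetical sketch of source-dependent matching: records match on the API/government well ID when a source system carries it, and fall back to a looser rule (a normalized well name) otherwise. All field names and data are invented for illustration; this is the logic only, not Informatica MDM's actual rule configuration:

```python
# Hypothetical sketch of source-dependent match rules for well records.
# Field names ("api_id", "well_name") and values are illustrative only.

def normalize(name):
    # Collapse whitespace and case so "Smith  1H" and "SMITH 1H" compare equal.
    return " ".join(name.upper().split())

def match(record_a, record_b):
    # Prefer the authoritative identifier when both records carry one.
    if record_a.get("api_id") and record_b.get("api_id"):
        return record_a["api_id"] == record_b["api_id"]
    # Fallback rule for source systems that do not carry the API ID.
    return normalize(record_a["well_name"]) == normalize(record_b["well_name"])

a = {"api_id": "42-501-20196", "well_name": "Smith 1H"}   # e.g., from Drilling
b = {"api_id": "42-501-20196", "well_name": "SMITH #1H"}  # e.g., from Finance
c = {"well_name": "smith  1h"}                            # source without IDs

print(match(a, b), match(a, c))  # True True
```

A real hub also records the surviving "golden" record and the cross-references back to each source, which is what makes the history and lineage tracking described above possible.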
In addition to Informatica MDM, our Noah Consulting MDM Trust Framework includes Informatica PowerCenter for data integration, Informatica Data Quality for data cleansing and Informatica Data Virtualization.
Q: Can you give some examples of the business value gained by mastering well data?
A: One person said to me, “I’m so overwhelmed! We’ve never had one place to look at this well data before.” With MDM centrally managing master well data and fueling key business applications, many upstream processes can be optimized to achieve their full potential value.
People spend less time entering well data on the front end and reconciling well data on the back end. Well data is entered once and it’s automatically shared across all systems that need it. People can trust that it’s consistent across systems. Also, because the data across systems is now tied together, it provides business value they were unable to realize before, such as predictive analytics.
Q. What’s next?
A. There’s a lot of insight that can be gained by understanding the relationships between the well, and the people, equipment and facilities associated with it. Next, we’re planning to add the operational hierarchy. For example, we’ll be able to identify which production engineer, reservoir engineer and foreman are working on a particular well.
We’ve also started gathering business requirements for equipment and facilities to be tied to each well. There’s a lot more business value on the horizon as the company streamlines their well information lifecycle and the valuable relationships around the well.
If you missed the webinar, you can watch the replay now: Attention E&P Executives: Streamlining the Well Information Lifecycle.
A few weeks ago, a regional US bank asked me to perform some compliance and use case analysis around fixing their data management situation. This bank prides itself on customer service and SMB focus, while using large-bank product offerings. However, they were about a decade behind most banks in modernizing their IT infrastructure to stay operationally on top of things.
This included technologies like ESB, BPM, CRM, etc. They were also a sub-optimal user of EDW and analytics capabilities. Having said all this, there was a commitment to change things up, which is always a needed first step to any recovery program.
As I conducted my interviews across various departments (list below) it became very apparent that they were not suffering from data poverty (see prior post) but from lack of accessibility and use of data.
- Vendor Management & Risk
- Commercial and Consumer Depository products
- Credit Risk
- HR & Compensation
- Private Banking
- Customer Solutions
This lack of use occurred across the board. The natural reaction was to throw more bodies and more Band-Aid marts at the problem. Users also started to operate under the assumption that it will never get better. They just resigned themselves to mediocrity. When some new players came into the organization from various systemically critical banks, they shook things up.
Here is a list of use cases they want to tackle:
- The proposition of real-time offers based on customer events, as simple as offering investment banking products when a deposit account sees an unusually high inflow of cash.
- The use of all mortgage application information to understand debt/equity ratio to make relevant offers.
- The capture of true product and customer profitability across all lines of commercial and consumer products including trust, treasury management, deposits, private banking, loans, etc.
- The agile evaluation, creation, testing and deployment of new terms on existing products and products under development, shortening the product development life cycle.
- The reduction of wealth management advisors’ time to research clients and prospects.
- The reduction of unclaimed use tax, insurance premiums and leases paid on consumables, real estate and requisitions due to incorrect equipment status and location. This originated from assets no longer owned, scrapped, or moved to a different department, etc.
- The more efficient reconciliation between transactional systems and finance, which often uses multiple party IDs per contract change in accounts receivable, while the operating division uses one based on a contract and its addendums. An example would be vendor payment consolidation to create a true supplier-spend view and thus take advantage of volume discounts.
- The proactive creation of a central compliance footprint (AML, 314, Suspicious Activity, CTR, etc.), allowing for quicker turnaround and fewer audit instances from MRAs (matters requiring attention).
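Several of these use cases, such as vendor payment consolidation for a true supplier-spend view, boil down to resolving many party IDs or name variants to one canonical party. A toy sketch of the idea follows, with invented data and a deliberately naive normalization rule; real party resolution would rely on mastered IDs and cross-references, not string tricks:

```python
# Toy illustration of consolidating vendor spend across divisions whose
# systems spell the same supplier differently. All data is made up.
from collections import defaultdict

payments = [
    ("Acme Corp.", 120_000),       # commercial division's spelling
    ("ACME CORPORATION", 80_000),  # consumer division's spelling
    ("Globex LLC", 50_000),
]

def canonical(name):
    # Naive normalization rule for illustration only.
    name = name.upper().rstrip(".")
    for suffix in (" CORPORATION", " CORP", " LLC"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
    return name.strip()

spend = defaultdict(int)
for vendor, amount in payments:
    spend[canonical(vendor)] += amount

print(dict(spend))  # {'ACME': 200000, 'GLOBEX': 50000}
```

Only after this consolidation does the "true supplier-spend" exist, and with it the leverage to negotiate volume discounts.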
MONEY TO BE MADE – PEOPLE TO SEE
Adding these up came to about $31 to $49 million annually in cost savings, new revenue or increased productivity for this bank with $24 billion total assets.
So now that we know there is money to be made by fixing the data of this organization, how can we realistically roll this out in an organization with many competing IT needs?
The best way to go about this is to attach any kind of data management project to a larger, business-oriented project, like CRM or EDW. Rather than wait for these to go live without good seed data, why not feed them with better data as a key work stream within their respective project plans?
To summarize my findings, I want to quote three people I interviewed. A lady who recently had to struggle through an OCC audit told me she believes the banks that can remain compliant at the lowest cost will ultimately win the end game; here she meant particularly tier 2 and 3 size organizations. A gentleman from commercial banking left me with this statement: “Knowing what I know now, I would not bank with us.” The same lady also said, “We engage in spreadsheet Kung Fu” to bring data together.
Given all this, what would you suggest? Have you worked with an organization like this? Did you encounter any similar or different use cases in financial services institutions?
Recently, I had the opportunity to interview a half dozen CIOs and a half dozen CFOs. Like a marriage therapist, I got to hear each party’s story about the relationship. CFOs, in particular, felt that the quality of the relationship could impact their businesses’ success. Armed with this knowledge, I wanted to see if I could help each leader build a better working relationship. Previously, I let CIOs know about the emergence and significance of the strategic CFO. In today’s post, I will start by sharing the CIOs’ perspective on the CFO relationship, and then I will discuss how CFOs can build better CIO relationships.
CIOs feel under the gun these days!
If you don’t know, CIOs feel under the gun these days. CIOs see their enterprises demanding ubiquitous computing. Users expect corporate apps to look like their personal apps, such as Facebook. They want to bring their own preferred devices. Most of all, they want all their data on any device when they need it. This means CIOs are trying to manage a changing technical landscape of mobile, cloud, social, and big data, all vying for both dollars and attention. As a result, CIOs see their role undergoing a sea change. Today, they need to focus less on building things and more on managing vendors. CIOs say that they need to 1) better connect what IT is doing to support the business strategy; 2) improve technical orchestration; and 3) improve process excellence. This is a big and growing charter.
CIOs see the CFO conversation being just about the numbers
CIOs worry that you don’t understand how many things are now run by IT, and that historical percentages of revenue may no longer be appropriate. Think about healthcare, which used to be a complete laggard in technology but today is having everything digitized. Even a digital thermometer plugs into an iPad so it communicates directly with a patient record. The world has clearly changed. And CIOs worry that you view IT as merely a cost center, and that you do not see the value generated through IT investment or the asset that information provides to business decision makers. However, the good news is that I believe a different type of discussion is possible, and that CFOs have the opportunity to play an important role in helping to shape the value that CIOs deliver to the business.
CFOs should share their experience and business knowledge
CFOs that I talked to said they believe the CFO/CIO relationship needs to be complementary and that the two roles have the most concentric rings. These CFOs believe the stronger the relationship, the better it is for their business. One area where you can help the CIO is in sharing your knowledge of the business and its needs. CIOs are trying to get closer to the business, and you can help build this linkage and support the requests that come out of this process. Clearly, an aligned CFO can be “one of the biggest advocates of the CIO”. Given this, make sure that you are on your CIO’s Investment Committee.
Tell your CIO about your data pains
CFOs need to be good customers too. CFOs that I talked to told me they know their business has “a data issue”. They worry about the integrity of data from the source. CFOs see their role as relying increasingly on timely, accurate data. They also know they have disparate systems and too much manual work going on in the back office. For them, integration needs to exist from the front end to the back end. Their teams personally feel the large number of manual steps.
For this reason, the CFOs we talked to believe that the integration of data is a big issue, whether they are in a small or large business. Have you talked to your CIO about data integration or data quality projects to change the ugliness that you have to live with day in, day out? It will make you and the business more efficient. One CFO was blunt here, saying, “Making life easier is all about the systems. If the systems suck then you cannot trust the numbers when you get them. You want to access the numbers easily, timely, and accurately. You want to make it easier to forecast so you can set expectations with the business and externally.”
At the same time, CFOs that I talked to worried about the quality of financial and business data analysis. Once they have the data, they worry about being able to analyze the information effectively. Increasingly, CFOs say they need to help drive synergies across their businesses. At the same time, CFOs increasingly need to manage upward with information. They want information for decision makers so they can make better decisions.
Changing the CIO Dialog
So it is clear that CFOs like you see data, in particular financial data, as a competitive advantage. The question is, as your unofficial therapist: why aren’t you having a discussion with your CIO that goes beyond the numbers or the financial justification for this or that system, and instead asks about the integration investment that can make your integration problems go away?
The strategic CFO is different than the “1975 Controller CFO”
Traditionally, CIOs have tended to work with what one CIO called a “1975 Controller CFO”. For this reason, the relationship between CIOs and CFOs was often expressed in a single word: “contentious”. But a new type of CFO is emerging that offers the potential of a different type of relationship. These so-called “strategic CFOs” can be an effective ally for CIOs. The question is, which type of CFO do you have? In this post, I will provide a bit of a litmus test so you can determine what type of CFO you have, but more importantly, I will share how you can take maximum advantage of having a strategic CFO relationship. But first, let’s hear a bit more of the CIOs’ reactions to CFOs.
Views of CIOs according to CIO interviews
Clearly, “the relationship…with these CFOs is filled with friction”. Controller CFOs “do not get why so many things require IT these days. They think that things must be out of whack.” One CIO said that Controller CFOs think technology should only cost 2-3% of revenue, while it can easily reach 8-9% of revenue these days. Another CIO complained that their discussions with a Controller CFO are only about IT productivity and effectiveness. In their eyes, this has limited the topics of discussion to IT cost reduction, IT-produced business savings, and the soundness of the current IT organization. Unfortunately, this CIO believes that Controller CFOs are not concerned with creating business value and do not see information as an asset. Instead, they view IT as a cost center. Another CIO says Controller CFOs are just about the numbers and see the CIO role as being about signing checks. It is a classic “demand versus supply” issue. At the same time, CIOs say that reporting to a Controller CFO narrows their function. As well, they believe it signals to the rest of the organization “that IT is not strategic and less important than other business functions”.
What then is this strategic CFO?
In contrast to their controller peers, strategic CFOs often have a broader business background than their accounting and CPA peers. Many have also pursued an MBA. Some have public accounting experience. Others come from professions like legal, business development, or investment banking.
More important than where they came from, strategic CFOs see a world that is about more than just numbers. They want to be more externally facing and to understand their company’s businesses. They tend to focus as much on what is going to happen as on what has happened. Remember, financial accounting is backward facing. Given this, strategic CFOs spend a lot of time trying to understand what is going on in their firm’s businesses. One strategic CFO said they do this so they can contribute and add value: “I want to be a true business leader.” Taking this posture often puts them in the top three decision makers for their business. There may be lessons in this posture for technology-focused CIOs.
Why is a strategic CFO such a game changer for the CIO?
One CIO put it this way: “If you have a modern day CFO, then they are an enabler of IT.” Strategic CFOs agree, seeing themselves as having “the most concentric circles with the CIO”. They believe they need “CIOs more than ever to extract data to do their jobs better and to provide the management information business leadership needs to make better business decisions”. At the same time, the perspective of a strategic CFO can be valuable to the CIO because they have good working knowledge of what the business wants. They also tend to be close to the management information systems and computer systems. CFOs typically understand the needs of the business better than most staff functions. The CFO, therefore, can be the biggest advocate of the CIO. This is why strategic CFOs should be on the CIO’s Investment Committee. Finally, a strategic CFO can help a CIO ensure their technology selections meet affordability targets and are compliant with the corporate strategy.
Are the priorities of a strategic CFO different?
Strategic CFOs still care about P&L, Expense Management, Budgetary Control, Compliance, and Risk Management. But they are also concerned about performance management for the enterprise as a whole and senior management reporting. As well, they want to do the above tasks faster, so finance and other functions can do in-period management by exception. For this reason they see data and data analysis as a big issue.
Strategic CFOs care about data integration
In interviews of strategic CFOs, I saw a group of people that truly understand the data holes in the current IT system. And they intuit firsthand the value proposition of investing to fix things here. These CFOs say that they worry “about the integrity of data from the source and about being able to analyze information”. They say that they want the integration to be good enough that at the push of button they can get an accurate report. Otherwise, they have to “massage the data and then send it through another system to get what you need”.
These CFOs say that they really feel the pain of systems not talking to each other. They understand this means making disparate systems from the frontend to the backend talk to one another. But they, also, believe that making things less manual will drive important consequences including their own ability to inspect books more frequently. Given this, they see data as a competitive advantage. One CFO even said that they thought data is the last competitive advantage.
Strategic CFOs are also worried about data security. They believe their auditors are going after this with a vengeance. They are really worried about getting hacked. One said, “Target scared a lot of folks and was in many respects a watershed event.” At the same time, strategic CFOs want to be able to drive synergies across the business. One CFO even extolled the value of a holistic view of the customer. When I asked why this was a finance objective versus a marketing objective, they said finance is responsible for business metrics, and they have gaps in their business metrics around the customer, including the percentage of cross-selling taking place between business units. Another CFO amplified this theme by saying that “increasingly we need to manage upward with information. For this reason, we need information for decision makers so they can make better decisions”. Another strategic CFO summed this up by saying, “The integration of the right systems to provide the right information needs to be done so we and the business have the right information to manage and make decisions at the right time”.
So what are you waiting for?
If you are lucky enough to have a Strategic CFO, start building your relationship. And you can start by discussing their data integration and data quality problems. So I have a question for you. How many of you think you have a Controller CFO versus a Strategic CFO? Please share here.
In my last blog, I talked about the dreadful experience of cleaning raw data by hand in my former life as an analyst. Well, the truth is, I was not alone. At a recent data mining Meetup event in the San Francisco Bay Area, I asked a few analysts: “How much time do you spend on cleaning your data at work?” “More than 80% of my time” and “most of my days”, said the analysts, and “it is not fun”.
But check this out: there are over a dozen Meetup groups focused on data science and data mining here in the Bay Area where I live. Those groups put on events multiple times a month, with topics often around hot, emerging technologies such as machine learning, graph analysis, real-time analytics, new algorithms for analyzing social media data, and of course, anything Big Data. Cool BI tools, new programming models and algorithms for better analysis are a big draw for data practitioners these days.
That got me thinking… if what the analysts said to me is true, i.e., they spend 80% of their time prepping data and only the remaining 20% analyzing the data and visualizing the results (which, BTW, “is actually fun”, quoting one data analyst), then why are they drawn to events focused on tools that can only help them 20% of the time? Why wouldn’t they want to explore technologies that can help address the dreadful 80%, the data scrubbing they complain about?
Having been there myself, I thought perhaps a little self-reflection would help answer the question.
As a student of math, I love data and am fascinated by the good stories I can discover in it. My two-year math program in graduate school was primarily focused on learning how to build fabulous math models to simulate real events, and on using those formulas to predict the future or look for meaningful patterns.
I used BI and statistical analysis tools while at school, and continued to use them at work after I graduated. That software was great in that it helped me get to the results and see what was in my data, so I could develop conclusions and make recommendations based on those insights for my clients. Without BI and visualization tools, I would not have delivered any results.
That was the fun and glamorous part of my job as an analyst. But when I was not creating nice charts and presentations to tell the stories in my data, I was spending time, a great amount of time, sometimes up to the wee hours, cleaning and verifying my data. I was convinced that was part of my job and I just had to suck it up.
It was only a few months ago that I stumbled upon data quality software – it happened when I joined Informatica. At first I thought they were talking to the wrong person when they started pitching me data quality solutions.
Turns out, the concept of data quality automation is a highly relevant and extremely intuitive subject to me, and to anyone who deals with data on a regular basis. Data quality software offers an automated process for data cleansing that is much faster and delivers more accurate results than a manual process. To put that in math context: if a data quality tool can reduce the data cleansing effort from 80% to 40% of an analyst’s time (BTW, this is hardly a random number; some of our customers have reported much better results), that means analysts can free up 40% of their time from scrubbing data and use it to do the things they like: playing with data in BI tools, building new models, running more scenarios, producing different views of the data, and discovering things they could not before, all with clean, trusted data. No more bored-to-death experience. What they are left with is improved productivity, more accurate and consistent results, compelling stories about data, and most important, the chance to focus on doing the things they like! Not too shabby, right?
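For readers who have never seen rule-based cleansing in action, here is a toy illustration of the kind of steps such tools automate: standardize, validate, and deduplicate. The data and rules are invented, and a real data quality tool applies far richer rules at scale:

```python
# Toy example of rule-based cleansing: trim whitespace, standardize case,
# validate a field, and drop duplicates. All data is invented.
raw = [
    {"name": "  jane doe ", "email": "JANE@EXAMPLE.COM"},
    {"name": "Jane Doe",    "email": "jane@example.com"},  # duplicate once cleaned
    {"name": "Bob Ray",     "email": "not-an-email"},      # fails validation
]

def clean(row):
    # Standardization rules: collapse whitespace, fix case.
    return {
        "name": " ".join(row["name"].split()).title(),
        "email": row["email"].strip().lower(),
    }

seen, cleaned, rejected = set(), [], []
for row in map(clean, raw):
    if "@" not in row["email"]:           # simplistic validity rule
        rejected.append(row)
    elif tuple(row.items()) not in seen:  # deduplicate standardized rows
        seen.add(tuple(row.items()))
        cleaned.append(row)

print(cleaned)   # one Jane Doe record survives
print(rejected)  # the invalid-email record is flagged for review
```

The point is not the code; it is that every rule encoded once runs on every record, every time, instead of an analyst re-doing the same fixes by hand each quarter.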
I am excited about trying out the data quality tools we have here at Informatica. My fellow analysts, you should start looking into them too. And I will check back in soon with more stories to share.
After I graduated from business school, I started reading Fortune Magazine. I guess that I became a regular reader because each issue largely consists of a set of mini-business cases. And over the years, I have even started to read the witty remarks from the managing editor, Andy Serwer. However, this issue’s comments were even more evocative than usual.
Connectivity is perhaps the biggest opportunity of our time
Andy wrote, “Connectivity is perhaps the biggest opportunity of our time. As technology makes the world smaller, it is clear that the countries and companies that connect the best—either in terms of, say, traditional infrastructure or through digital networks—are in the driver’s seat”. Andy sees differentiated connectivity as involving two elements: access and content. This is important to note because Andy believes the biggest winners going forward are going to be the best connectors to each.
Enterprises need to evaluate how they collect, refine, and make data useful
But how do enterprises establish world-class connectivity to content? I would argue—whether you are talking about large or small data—that it comes from improving an enterprise’s ability to collect, refine, and create useful data. Recent CFO research stressed the importance of enterprise data gathering capabilities. CFOs said that their enterprises need to “get data right” even as they confirmed that their enterprises do in fact have a data problem. The CFOs said that they worry about the integrity of data from the source forward. And once they have manually created clean data, they worry about making it useful to their enterprises. Why does this data matter so much to the CFO? Because as CFOs get more strategic, they are trying to make sure their firms drive synergies across their businesses.
Businesses need to make sense of data and get it to business users faster
One CFO put it this way: “data is potentially the only competitive advantage left”. Another said, “our businesses need to make better decisions from data. We need to make sense of data faster.” At the same time, leading-edge thinkers like Geoffrey Moore have been suggesting that businesses need to move from “systems of record” applications to “systems of engagement” applications. This notion suggests not only the importance of providing more digestible apps, but also of recognizing that the most important apps for business users will provide relevant information for decision making. Put another way, data is clearly becoming fuel for enterprise decision making.
“Data Fueled Apps” will provide a connectivity advantage
For this reason, “data fueled” apps will be increasingly important to the business. Decision makers these days want to practice “management by walking around,” to quote Tom Peters’ book In Search of Excellence. That means having critical, fresh data at their fingertips for each and every meeting. Clearly, organizations that provide this type of data connectivity will establish the connectivity advantage that Serwer suggested in his editor’s comments. This applies to consumer-facing apps as well. Serwer also comments on the impact of Apple and Facebook. Most consumers today are far better informed before they make a purchase. The customer-facing apps that have led the way, Amazon for example, have provided the relevant information to inform consumers on their purchase journey.
Delivering “Data Fueled Apps” to the Enterprise
But how do you create the enterprise-wide connectivity to power “data fueled apps”? It is clear from the CFOs’ comments that work is needed here. That work involves creating data that is systematically clean, safe, and connected. Why does the data need to be clean? The CFOs we talked to said that when the data is not clean, they have to massage it manually and then move it from system to system. That is not the kind of system of engagement envisioned by Geoffrey Moore. One CFO wants to move to a world where he can access the numbers “easily, timely, and accurately”.
Data also needs to be safe. This means that only people with the right access should be able to see it, whether we are talking about transactional or analytical data. This may sound obvious, but very few organizations isolate and secure data as it moves from system to system. And lastly, data needs to be connected. Yet another CFO said, “the integration of the right systems to provide the right information needs to be done so we have the right information to manage and make decisions at the right time”. He continued, “we really care about technology integration and getting it less manual. It means that we can inspect the books halfway through the cycle. And getting less manual means we can close the books even faster. However, if systems don’t talk (connect) to one another, it is a big issue”.
Finally, whether we are discussing big data or small data, we need to make sure the data collected is relevant and easy to consume. What is needed here is a data intelligence layer that provides easy ways to locate useful data and that recommends or guides ways to improve it. That way, analysts and leaders can spend less time searching for or preparing data and more time analyzing it to connect the business dots. This can involve mapping data relationships across all applications and drawing inferences from data to drive real-time responses.
So in this new connected world, we first need to set up a data infrastructure that continuously makes data clean, safe, and connected, regardless of use case. The infrastructure may not be needed to collect the data itself, but it is needed to define the connectivity (in the shape of access and content). We also need to make sure this infrastructure is reusable, so that the time from concept to new data fueled app is minimized. And then, to drive informational meaning, we layer the intelligence on top. With this, we can deliver “data fueled apps” that give business users the access and content to drive better business differentiation and decisioning!
In the media there is a constant discussion about a mismatch between the skills education provides and the capabilities graduates bring to the workplace, and about whether graduates are prepared for work at all. Because students rarely work with large data sets, skills that employers need may be missing. Below, I outline the skills that can be gained by working with large data sets.
Some types of data handling are simply high volume. Business intelligence and analytics consume far more data than they did 20 years ago, and handling the increasing volume is important. Research programming and data science are truly part of big data; even if you are not doing data science yourself, you may be preparing and handling the data sets. Some industries and organisations simply have higher volumes of data; retail is one example. Companies that used to have less volume are obtaining more data as they adapt to the big data world. We should expect the same trend among organisations that have had high data volumes in the past: they are going to face an even bigger big data experience.
There are practical aspects to handling large data sets. Working with them builds experience in storage management and design, data loading, query optimization, parallelization, bandwidth issues, and data quality. And when you take on those issues, architecture skills are needed and can be gained.
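To make the loading and storage issues concrete, here is a minimal sketch of streaming a delimited file row by row instead of reading it all into memory, which is the habit large data sets force you to learn. The field names and inline data are hypothetical stand-ins for a real file on disk.

```python
import csv
import io

# Stand-in for a large file; in practice this would be
# open("large_dataset.csv", newline="") streamed row by row.
data = io.StringIO("id,amount\n1,10\n2,25\n3,5\n")

# Streaming aggregation: memory use stays constant regardless of file
# size, because each row is processed and discarded in turn.
total = 0
count = 0
for row in csv.DictReader(data):
    total += int(row["amount"])
    count += 1

print(count, total)  # 3 rows, total 40
```

The same pattern scales from a three-row sample to millions of rows; only the open() call changes.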
Today, the trends known as the Internet of Things, All Things Data, and Data First are forming. As a result there will be demand for graduates who are familiar with handling high volumes of data.
The responsibility for using a large data set falls to the student, though faculty need to encourage it, since they often set and guide students’ goals. A number of large data sets that students could use are on the web. One example is the Harvard Library Bibliographic Dataset, available at http://openmetadata.lib.harvard.edu/bibdata. Another is the City of Chicago, which makes a number of datasets available for download in a wide range of standard formats at https://data.cityofchicago.org/. The advantages of public large data sets are their volume and the opportunity to assess their data quality. Public data sets can hold many records and represent many more combinations than we can quickly generate by hand. Using even a small real-world data set is a vast improvement over the limited number of variations in self-generated data, and may be better than using a tool to generate data. Once downloaded, such data can be manipulated and used as a base for loading.
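A first exercise with any downloaded data set is a quick quality check: row count, duplicates, missing values. Below is a minimal sketch using the standard library; the inline sample and its field names are hypothetical stand-ins for a file downloaded from a portal such as data.cityofchicago.org.

```python
import csv
import io

# Tiny inline sample standing in for a downloaded CSV file.
sample = """name,ward,zip
ABC TOWING,10,60617
XYZ CARTAGE,,60622
ABC TOWING,10,60617
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Basic volume and quality checks on the data set.
total = len(rows)
distinct = len({tuple(r.values()) for r in rows})
missing_ward = sum(1 for r in rows if not r["ward"])

print(total, distinct, missing_ward)  # 3 rows, 2 distinct, 1 missing ward
```

Even this small check surfaces the two classic public-data problems students should learn to spot: duplicate records and empty fields.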
Loading large data sets is part of being prepared, and it requires tools, ranging from simple loaders to full data integration suites. A good option for students who need to load data sets is PowerCenter Express, announced last year. It is free for use with up to 250,000 rows per day, and it is an ideal way to experience a full enterprise data integration tool while working with significantly higher volumes.
Big Data is here, and it is a growing trend, so students need to work with larger data sets than before. It is also feasible: the tools and the data sets students need are available. Therefore, in view of current trends, large data set use should become standard practice in computer science and related courses.
Did I really compare data quality to flushing toilet paper? Yeah, I think I did. Makes me laugh when I read that, but still true. And yes, I am still playing with more data. This time it’s a location schedule for earthquake risk. I see a 26-story structure with a building value of only $136,000 built in who knows what year. I’d pull my hair out if it weren’t already shaved off.
So let’s talk about the six steps for data quality competency in underwriting. These six steps are standard in the enterprise, but what we will discuss is how to tackle them in insurance underwriting, and more importantly, the business impact of effectively adopting the competency. It’s a repeating, self-reinforcing cycle, and when done correctly it can be intelligent and adaptive to changing business needs.
Profile – Effectively profile and discover data from multiple sources
We’ll start at the beginning, a very good place to start. First you need to understand your data: where is it from, and in what shape does it arrive? Whether the sources are internal or external, the profile step will help identify the problem areas. In underwriting, this involves a lot of external submission data from brokers and MGAs, which is then combined with internal and service bureau data to get a full picture of the risk. Identify your key data points for underwriting and a desired state for that data. Once the data is profiled, you’ll get a very good sense of where your troubles are. Continue to profile as you bring other sources online, using the same standards of measurement. As an aside, this will also help in remediating brokers that are not meeting the standard.
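The profile step can be sketched in a few lines: measure, per source, how complete each critical field is. The record layout and field names below (construction, year_built, tiv) are illustrative assumptions, not a real submission schema.

```python
from collections import defaultdict

# Hypothetical submission records arriving from two broker feeds.
submissions = [
    {"source": "broker_a", "construction": "wood frame", "year_built": 1998, "tiv": 450_000},
    {"source": "broker_a", "construction": None,         "year_built": None, "tiv": 900_000},
    {"source": "broker_b", "construction": "masonry",    "year_built": 2005, "tiv": None},
]

# Count, per source, how many records have each critical field populated.
fields = ["construction", "year_built", "tiv"]
counts = defaultdict(lambda: {"n": 0, **{f: 0 for f in fields}})
for rec in submissions:
    stats = counts[rec["source"]]
    stats["n"] += 1
    for f in fields:
        if rec[f] is not None:
            stats[f] += 1

# Report completeness ratios; a low ratio flags a broker to remediate.
for source, stats in sorted(counts.items()):
    print(source, {f: stats[f] / stats["n"] for f in fields})
```

Running the same profile against every new source gives you the consistent standard of measurement described above.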
Measure – Establish data quality metrics and targets
As an underwriter, you will need to determine the quality bar for the data you use. Usually this means flagging your most critical data fields for meeting underwriting guidelines. See where you are and where you want to be, and determine how you will measure the quality of the data as well as the desired state. And by the way, actuarial and risk will likely do the same thing on the same or similar data; over time it all comes together as a team.
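A quality bar is easiest to enforce when it is written down as numbers. Here is a minimal sketch of a scorecard comparing measured completeness against targets; the field names and percentages are hypothetical.

```python
# Agreed targets and measured completeness for critical underwriting fields.
targets  = {"construction": 0.98, "year_built": 0.95, "tiv": 0.99}
measured = {"construction": 0.91, "year_built": 0.96, "tiv": 0.88}

# Report only the fields that miss their target, with the shortfall.
gaps = {f: round(targets[f] - measured[f], 2)
        for f in targets if measured[f] < targets[f]}
print(gaps)
```

Fields that appear in the gap report become the work queue for the design step that follows.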
Design – Quickly build comprehensive data quality rules
This is the meaty part of the cycle, and fun to boot. First look to your desired future state and your critical underwriting fields. For each one, determine the rules by which you normally fix errant data. What do you do when you see a 30-story wood frame structure? How do you validate, cleanse and remediate that discrepancy? This may involve fuzzy logic or supporting data lookups, and it can easily be captured. Do this, write it down, and catalog it to be codified in your data quality tool. As you go along, you will see a growing library of data quality rules compiled for broad use.
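Cataloged rules can be as simple as small functions collected into a library. The two rules below are a sketch: the thresholds, field names, and messages are illustrative assumptions, not underwriting standards.

```python
# Two cataloged plausibility rules for property records.
def plausible_construction(record):
    """Wood-frame buildings above ~6 stories are almost certainly data errors."""
    if record["construction"] == "wood frame" and record["stories"] > 6:
        return "implausible: wood frame with {} stories".format(record["stories"])
    return None

def plausible_value(record):
    """Flag building values far too low for the reported story count."""
    if record["stories"] >= 20 and record["building_value"] < 1_000_000:
        return "implausible: {} stories valued at ${:,}".format(
            record["stories"], record["building_value"])
    return None

RULES = [plausible_construction, plausible_value]

def run_rules(record):
    # Apply every cataloged rule; collect the messages that fire.
    return [msg for rule in RULES if (msg := rule(record)) is not None]

# The 26-story, $136,000 example from the opening trips the value rule.
print(run_rules({"construction": "masonry", "stories": 26, "building_value": 136_000}))
```

Each rule you write down this way becomes a reusable entry in the growing library.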
Deploy – Native data quality services across the enterprise
Once these rules are compiled and tested, they can be deployed for reuse across the organization. This is the beautiful, magical thing that happens: your institutional knowledge of your underwriting criteria is captured and reused. Not just once, but reused to cleanse existing data, new data and everything going forward. Your analysts will love you, your actuaries and risk modelers will love you; you will be a hero.
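Reuse means one rule, many datasets. A minimal, self-contained sketch of that idea (the rule, datasets, and field name are hypothetical):

```python
# One cataloged rule deployed once, then reused against every dataset
# that flows through: existing policies, new submissions, renewals.
def missing_year_built(record):
    return record.get("year_built") is None

datasets = {
    "existing_policies": [{"year_built": 1990}, {"year_built": None}],
    "new_submissions":   [{"year_built": None}, {"year_built": 2010}],
}

# The same check produces a comparable exception count for each dataset.
exceptions = {name: sum(missing_year_built(r) for r in rows)
              for name, rows in datasets.items()}
print(exceptions)
```

Because the rule lives in one place, fixing or tightening it improves every dataset it touches.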
Review – Assess performance against goals
Remember those goals you set for your data quality when you started? Check and see how you’re doing. After a few weeks or months, you should be able to profile the data, run the reports and see that the needle has moved. Remember that as part of the self-reinforcing cycle, you can now identify new issues to tackle and adjust approaches that aren’t working. Over time you’ll want to see higher quote flow, better productivity and more competitive premium pricing.
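The review itself can be a simple comparison of snapshots over time. The baseline, current, and goal figures below are hypothetical completeness ratios for two critical fields.

```python
# Quality snapshots: at the start of the program and a few months in.
baseline = {"construction": 0.78, "year_built": 0.85}
current  = {"construction": 0.93, "year_built": 0.90}
goal     = {"construction": 0.95, "year_built": 0.90}

# For each field, report how far the needle moved and whether the goal is met.
for field in goal:
    moved = round(current[field] - baseline[field], 2)
    print(field, "improved by", moved, "goal met:", current[field] >= goal[field])
```

Fields that improved but still miss the goal go back into the cycle for another pass.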
Monitor – Proactively address critical issues
Now monitor constantly. As you bring new MGAs online, receive new underwriting guidelines or launch into new lines of business you will repeat this cycle. You will also utilize the same rule set as portfolios are acquired. It becomes a good way to sanity check the acquisition of business against your quality standards.
In case it wasn’t apparent, your data quality plan is now largely automated. With few manual exceptions, you should not have to remediate data the way you did in the past. In each of these steps there is obvious business value. In the end, it all adds up to better risk/cat modeling, more accurate risk pricing, cleaner data (for everyone in the organization) and more time for the core business of underwriting. Imagine if you could increase your quote volume simply by not needing to muck around in data. Imagine if you could improve your quote-to-bind ratio through better quality data and pricing. The last time I checked, that’s just good insurance business.
And now for something completely different…cats on pianos. No, just kidding. But check here to learn more about Informatica’s insurance initiatives.
The growth of big data drives many things, including the use of cloud-based resources, the growth of non-traditional databases, and, of course, the growth of data integration. What’s typically not as well understood are the required patterns of data integration and the ongoing need for better and more innovative data cleansing tools.
Indeed, while writing Big Data@Work: Dispelling the Myths, Uncovering the Opportunities, Tom Davenport observed data scientists at work. During his talk at VentureBeat’s DataBeat conference, Davenport said data scientists would need better data integration and data cleansing tools before they’d be able to keep up with the demand within organizations.
But Davenport is not alone. Most who deploy big data systems see the need for data integration and data cleansing tools. In most instances, not having those tools in place hindered progress.
I would agree with Davenport, in that the number one impediment to moving to any type of big data is how to clean and move data. Addressing that aspect of big data is Job One for enterprise IT.
The fact is, just implementing Hadoop-based databases won’t make a big data system work. Indeed, the data must come from existing operational data stores, and leverage all types of interfaces and database models. The fundamental need to translate the data structure and content to effectively move from one data store (or stores, typically) to the big data systems has more complexities than most enterprises understand.
The path forward may require more steps than originally anticipated; perhaps the whole big data thing was sold as something much easier than it actually is. My role for the last few years has been to let enterprises know that data integration and data cleansing are core components of building and deploying big data systems. You may as well learn to deal with it early in the process.
The good news is that data integration is not a new concept, and the technology is more than mature. What’s more, data cleansing tools can now be part of data integration offerings, actually cleaning the data as it moves from place to place, in near real time.
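Cleansing data “as it moves” can be pictured as a generator stage in a pipeline: records are standardized and filtered in flight, so the target store only ever sees clean data. This is a minimal sketch of the idea, not any vendor’s implementation; the field names are hypothetical.

```python
# A cleansing stage sitting between source and target in a pipeline.
def cleanse(records):
    for rec in records:
        name = rec.get("supplier_name", "").strip().upper()
        if not name:               # drop records with no usable key
            continue
        yield {**rec, "supplier_name": name}

source = [
    {"supplier_name": "  acme corp ", "amount": 100},
    {"supplier_name": "",             "amount": 55},
    {"supplier_name": "Acme Corp",    "amount": 200},
]

# Standardizing in flight makes the two "Acme" records match downstream.
cleaned = list(cleanse(source))
print(cleaned)
```

Because the stage is a generator, it streams: it works the same whether the source is a three-record list or a feed of millions of rows.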
So, doing big data anytime soon? Now is the time to define your big data strategy, in terms of the new technology you’ll be dragging into the enterprise. It’s also time to expand or change the use of data integration and perhaps the enabling technology that is built or designed around the use of big data.
I hate to sound like a broken record, but somebody has to say this stuff.