Tag Archives: Data Management
At long last, the anxiously awaited rules from the FAA have brought some clarity to the world of commercial drone use. Up until now, commercial drone use has been prohibited. The new rules, of course, won’t sit well with Amazon, which would like to drop merchandise on your porch at all hours. But the rules do work really well for insurers who would like to use drones to service their policyholders. So now drones will be flying, and soon fleets of unmanned cars will be driving the roadways, in any number of capacities. It seems to me to be an ambulance chaser’s dream come true. I mean, who wouldn’t want some seven- or eight-figure payday from Google for getting rear-ended?
What about “Great Data”? What does that mean in the context of unmanned vehicles, both aerial and terrestrial? Let’s talk about two aspects. First, the business benefits of great data using unmanned drones.
An insurance adjuster or catastrophe responder can leverage an aerial drone to survey large areas from a central location. They will pinpoint the locations needing attention for further investigation. This is a common scenario that many insurers talk about when the topic of aerial drone use comes up. Second to that is the ability to survey damage in hard-to-reach locations like roofs or difficult terrain (like farmland). But this is where great data comes into play. Surveying, service and use of unmanned vehicles demands that your data can answer some of the following questions for your staff operating in this new world:
Where am I?
Quality data and geocoded locations as part of that data are critical. In order to locate key risk locations, your data must be able to coordinate with the lat/long of the location recorded by your unmanned vehicles and the location of your operator. Ensure clean data through robust data quality practices.
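As a rough illustration of lining up a drone's recorded lat/long with geocoded policy locations, the sketch below uses the haversine great-circle formula. The function names, the sample policy records, and the search radius are assumptions made for illustration only; they do not come from any insurer's system.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def locations_within(drone_pos, locations, radius_km=1.0):
    """Return (distance_km, location) pairs inside radius_km, nearest first."""
    hits = []
    for loc in locations:
        d = haversine_km(drone_pos[0], drone_pos[1], loc["lat"], loc["lon"])
        if d <= radius_km:
            hits.append((d, loc))
    return sorted(hits, key=lambda pair: pair[0])

# Hypothetical geocoded policy locations
policy_locations = [
    {"policy_id": "P-001", "lat": 40.7130, "lon": -74.0060},
    {"policy_id": "P-002", "lat": 40.7300, "lon": -73.9950},
    {"policy_id": "P-003", "lat": 41.0000, "lon": -74.5000},
]

# Hypothetical drone position reported as (lat, lon)
nearby = locations_within((40.7128, -74.0060), policy_locations, radius_km=3.0)
```

A match like this is only as good as the geocodes behind it, which is why the data quality step comes first.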
Where are my policyholders?
Knowing the location of your policyholders not only relies on good data quality, but on knowing who they are and what risks you are there to help service. This requires a total customer relationship solution where you have a full view of not only locations, but risks, coverages and entities making up each policyholder.
What am I looking at?
Archived, current and work-in-process imaging is a key place where a Big Data environment can assist over time. By comparing saved images with new and in-process claims, the drone operator can quickly detect claims fraud and additional opportunities for service.
Now that we’ve answered the business value questions and leveraged this new technology to better service policyholders and speed claims, let’s turn to how great data can be used to protect the insurer and drone operator from liability claims. This is important. The FAA has stopped short of requiring commercial drone operators to carry special liability insurance, leaving that instead to the drone operators to orchestrate with their insurer. And now we’re back to great data. As everyone knows, accidents happen. Technology, especially robotic mobile technology, is not infallible. Something will crash somewhere, hopefully not causing injury or death, but sadly that too will likely happen. And there is nothing that will keep the ambulance chasers at bay more than robust great data. Any insurer offering liability cover for a drone operator should require that some of the following questions be answered by the commercial enterprise. And the interesting fact is that this information should be readily available if the business questions above have been answered.
- Where was my drone?
- What was it doing?
- Was it functioning properly?
Properly using the same data management technology as in the previous questions will provide valuable data to be used as evidence in the case of liability against a drone operator. Insurers would be wise to ask these questions of their liability policyholders who are using unmanned technology as a way to gauge liability exposure in this brave new world. The key to assessing that risk is robust data management and great data feeding the insurer’s unmanned policyholder service workers.
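To make the three liability questions concrete, here is a minimal sketch of looking up a drone's last recorded state at the time of an incident. The log layout, field names, and sample values are hypothetical; a real operator's telemetry would look different.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical flight log, sorted by timestamp
flight_log = [
    {"ts": datetime(2015, 3, 1, 10, 0, 0), "lat": 40.7128, "lon": -74.0060,
     "task": "roof survey", "status": "OK"},
    {"ts": datetime(2015, 3, 1, 10, 0, 30), "lat": 40.7131, "lon": -74.0058,
     "task": "roof survey", "status": "OK"},
    {"ts": datetime(2015, 3, 1, 10, 1, 0), "lat": 40.7135, "lon": -74.0055,
     "task": "return to base", "status": "LOW_BATTERY"},
]

def state_at(log, when):
    """Return the latest log record at or before `when`, or None if there is none."""
    times = [rec["ts"] for rec in log]
    idx = bisect_right(times, when)
    return log[idx - 1] if idx else None

# Where was the drone, what was it doing, and was it functioning properly?
incident = datetime(2015, 3, 1, 10, 0, 45)
record = state_at(flight_log, incident)
```

The lookup answers all three questions from one record, but only if the log was captured and retained in the first place.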
Time will tell all the great and imaginative things that will take place with this new technology. One thing is for certain. Great data management is required in all aspects from amazing customer service to risk mitigation in operations. Happy flying to everyone!!
By Philip Russom, TDWI, Research Director for Data Management.
I recently broadcast a really interesting Webinar with David Lyle, a vice president of product strategy at Informatica Corporation. David and I had a “fire-side chat” where we discussed one of the most pressing questions in data management today, namely: How can we prepare great data for great analytics, while still leveraging older best practices in data management? Please allow me to summarize our discussion.
Both old and new requirements are driving organizations toward analytics. David and I started the Webinar by talking about prominent trends:
- Wringing value from big data – The consensus today says that advanced analytics is the primary path to business value from big data and other types of new data, such as data from sensors, devices, machinery, logs, and social media.
- Getting more value from traditional enterprise data – Analytics continues to reveal customer segments, sales opportunities, and threats for risk, fraud, and security.
- Competing on analytics – The modern business is run by the numbers – not just gut feel – to study markets, refine differentiation, and identify competitive advantages.
The rise of analytics is a bit confusing for some data people. As experienced data professionals do more work with advanced forms of analytics (enabled by data mining, clustering, text mining, statistical analysis, etc.) they can’t help but notice that the requirements for preparing analytic data are similar-but-different as compared to their other projects, such as ETL for a data warehouse that feeds standard reports.
Analytics and reporting are two different practices. In the Webinar, David and I talked about how the two involve pretty much the same data management practices, but in different orders and priorities:
- Reporting is mostly about entities and facts you know well, represented by highly polished data that you know well. Squeaky clean report data demands elaborate data processing (for ETL, quality, metadata, master data, and so on). This is especially true of reports that demand numeric precision (about financials or inventory) or will be published outside the organization (regulatory or partner reports).
- Advanced analytics, in general, enables the discovery of new facts you didn’t know, based on the exploration and analysis of data that’s probably new to you. Preparing raw source data for analytics is simple, though at high levels of scale. With big data and other new data, preparation may be as simple as collocating large datasets on Hadoop or another platform suited to data exploration. When using modern tools, users can further prepare the data as they explore it, by profiling, modeling, aggregating, and standardizing data on the fly.
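As one small illustration of profiling data on the fly, the sketch below summarizes each column of a raw dataset by missing values, distinct values, and most common value. The `profile` function and the sample rows are assumptions for illustration, and the sketch assumes every row carries the same keys; it is not how any particular tool implements profiling.

```python
from collections import Counter

def profile(rows):
    """Summarize each column: missing count, distinct count, most common value."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            stats = columns.setdefault(name, {"missing": 0, "values": Counter()})
            if value is None or value == "":
                stats["missing"] += 1  # treat None and empty string as missing
            else:
                stats["values"][value] += 1
    return {
        name: {
            "missing": stats["missing"],
            "distinct": len(stats["values"]),
            "most_common": (stats["values"].most_common(1)[0][0]
                            if stats["values"] else None),
        }
        for name, stats in columns.items()
    }

# Hypothetical raw sensor rows
rows = [
    {"device": "sensor-1", "state": "ok", "temp": 21.5},
    {"device": "sensor-2", "state": "ok", "temp": None},
    {"device": "sensor-1", "state": "", "temp": 22.0},
]
report = profile(rows)
```

A quick summary like this is usually enough to decide which columns need standardizing before deeper exploration.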
Operationalizing analytics brings reporting and analysis together in a unified process. For example, once an epiphany is discovered through analytics (e.g., the root cause of a new form of customer churn), that discovery should become a repeatable BI deliverable (e.g., metrics and KPIs that enable managers to track the new form of churn in dashboards). In these situations, the best practices of data management apply to a lesser degree (perhaps on the fly) during the early analytic steps of the process, but then are applied fully during the operationalization steps.
Architectural ramifications ensue from the growing diversity of data and workloads for analytics, reporting, multi-structured data, real time, and so on. For example, modern data warehouse environments (DWEs) include multiple tools and data platforms, from traditional relational databases to appliances and columnar databases to Hadoop and other NoSQL platforms. Some are on premises and others are on clouds. On the downside, this results in high complexity, with data strewn across multiple platforms. On the upside, users get great data for great analytics by moving data to a platform within the DWE that’s optimized for a particular data type, analytic workload, price point, or data management best practice.
For example, a number of data architecture use cases have emerged successfully in recent years, largely to assure great data for great analytics:
- Leveraging new data warehouse platform types gives analytics the high performance it needs. Toward this end, TDWI has seen many users successfully adopt new platforms based on appliances, columnar data stores, and a variety of in-memory functions.
- Offloading data and its processing to Hadoop frees up capacity on EDWs. And it also gives unstructured and multi-structured data types a platform that is better suited to their management and processing, all at a favorable cost point.
- Virtualizing data assets yields greater agility and simpler data management. Multi-platform data architectures too often entail a lot of data movement among the platforms. But this can be mitigated by federated and virtual data management practices, as well as by emerging practices for data lakes and enterprise data hubs.
If you’d like to hear more of my discussion with Informatica’s David Lyle, please replay the Webinar from the Informatica archive.
The Ponemon Institute stated that the biggest concern for security professionals is that they do not know where sensitive data resides. Informatica’s Intelligent Data Platform provides data security professionals with the technology required to discover, profile, classify and assess the risk of confidential and sensitive data.
Last year, we began significant investments in data security R&D to support the initiative. This year, we continue the commitment by organizing around the vision. I am thrilled to be leading the Informatica Data Security Group, a newly-formed business unit comprised of a team dedicated to data security innovation. The business unit includes the former Application ILM business unit which consists of data masking, test data management and data archive technologies from previous acquisitions, including Applimation, ActiveBase, and TierData.
By having a dedicated business unit and engineering resources applying Informatica’s Intelligent Data Platform technology to a security problem, we believe we can make a significant difference addressing a serious challenge for enterprises across the globe. The newly formed Data Security Group will focus on new innovations in the data security intelligence market, while continuing to invest in and enhance our existing data-centric security solutions such as data masking, data archiving and information lifecycle management solutions.
The world of data is transforming around us and we are committed to transforming the data security industry to keep our customers’ data clean, safe and connected.
For more details regarding how these changes will be reflected in our products, message and support, please refer to the FAQs listed below:
Q: What is the Data Security Group (DSG)?
A: Informatica has created a new business unit, the Informatica Data Security Group, as a dedicated team focusing on data security innovation to meet the needs of our customers while leveraging the Informatica Intelligent Data Platform.
Q: Why did Informatica create a dedicated Data Security Group business unit?
A: Reducing Risk is among the top 3 business initiatives for our customers in 2015. Data Security is a top IT and business initiative for just about every industry and organization that stores sensitive, private, regulated or confidential data. Data Security is a Board room topic. By building upon our success with the Application ILM product portfolio and the Intelligent Data Platform, we can address more pressing issues while solving mission-critical challenges that matter to most of our customers.
Q: Is this the same as the Application ILM Business Unit?
A: The Informatica Data Security Group is a business unit that includes the former Application ILM business unit products comprised of data masking, data archive and test data management products from previous acquisitions, including Applimation, ActiveBase, and TierData, and additional resources developing and supporting Informatica’s data security products GTM, such as Secure@Source.
Q: How big is the Data Security market opportunity?
A: The data security software market is estimated to be a $3B market in 2015, according to Gartner. Total information security spending will grow a further 8.2 percent in 2015 to reach $76.9 billion.
Q: Who would be most interested in this announcement and why?
A: All leaders are impacted when a data breach occurs. Understanding the risk of sensitive data is a board room topic. Informatica is investing and committing to securing and safeguarding sensitive, private and confidential data. If you are an existing customer, you will be able to leverage your existing skills on the Informatica platform to address a challenge facing every team who manages or handles sensitive or confidential data.
Q: How does this announcement impact the Application ILM products – Data Masking, Data Archive and Test Data Management?
A: The existing Application ILM products are foundational to the Data Security Group product portfolio. These products will continue to be invested in, supported and updated. We are building upon our success with the Data Masking, Data Archive and Test Data Management products.
Q: How will this change impact my customer experience?
A: The Informatica product website will reflect this new organization by listing the Data Masking, Data Archive, and Test Data Management products under the Data Security product category. The customer support portal will reference Data Security as the top level product category. Older versions of the product and corresponding documentation will not be updated and will continue to reflect Application ILM nomenclature and messaging.
Data has always played a key role in informing decisions – machine generated and intuitive. In the past, much of this data came from transactional databases as well as unstructured sources, such as emails and flat files. Mobile devices appeared next on the map. We have found applications of such devices not just to make calls but also to send messages, take a picture, and update status on social media sites. As a result, new sets of data got created from user engagements and interactions. Such data started to tell a story by connecting dots at different location points and stages of user connection. “Internet of Things” or IoT is the latest technology to enter the scene that could transform how we view and use data on a massive scale.
Does IoT present a significant opportunity for companies to transform their business processes? The Internet of Things probably adds an important awareness veneer when it comes to data. It could bring data into focus early by connecting every data creation stage in any business process. It could de-couple the lagging factor in consuming data and making decisions based on it. Data generated at every stage in a business process could show an interesting trend or pattern and, better yet, tell a connected story. The result could be predictive maintenance of equipment involved in any process, which would further reduce cost. New product innovations would happen by leveraging the connectedness in data as generated by each step in a business process. We would soon begin to understand not only where the data is being used and how, but also the intent and context behind this usage. Organizations could then connect with their customers in a one-on-one fashion like never before, whether to promote a product or offer a promotion that could be both time and place sensitive. New opportunities to tailor product and service offerings for customers on an individual basis would create new growth areas for businesses. The Internet of Things could make this possible by bringing together previously isolated sets of data.
A recent Economist report, “The Virtuous Circle of Data: Engaging Employees in Data and Transforming Your Business”, suggests that 68% of data-driven businesses outperform their competitors when it comes to profitability. 78% of those businesses foster a better culture of creativity and innovation. The report goes on to suggest that 3 areas are critical for an organization to build a data-driven business, including data supported by devices: 1) Technology & Tools, 2) Talent & Expertise, and 3) Culture & Leadership. By 2020, it’s projected that there’ll be 50B connected devices, 7x more than human beings on the planet. It is imperative for an organization to have a support structure in place for device generated data and a strategy to connect with broader enterprise-wide data initiatives.
A comprehensive Internet of Things strategy would leverage speed and context of data to the advantage of business process owners. Timely access to device generated data can open up the channels of communication to end-customers in a personalized way at the moment of their readiness. It’s not enough anymore to know what customers may want or what they asked for in the past; rather, businesses need to anticipate what customers might want by connecting dots across different stages. IoT generated data can help bridge this gap.
How to Manage IoT Generated Data
More data places more pressure on both quality and security factors – key building blocks for trust in one’s data. Trust is ideally truth over time. Consistency in data quality and availability is going to be a key requirement for all organizations to introduce new products or service differentiated areas in a speedy fashion. Informatica’s Intelligent Data Platform or IDP brings together the industry’s most comprehensive data management capabilities to help organizations manage all data, including device generated, both in the cloud and on premises. Informatica’s IDP enables automated sensitive data discovery, such that data discovers users in the context where it’s needed.
Cool IoT Applications
There are a number of companies around the world that are working on interesting applications of Internet of Things related technology. Smappee from Belgium has launched an energy monitor that can itemize electricity usage and control a household full of devices by clamping a sensor around the main power cable. This single device can recognize individual signatures produced by each of the household devices and can let consumers switch off any device, such as an oven, remotely via smartphone. JIBO is an IoT device that’s touted as the world’s first family robot. It automatically uploads data on all interactions to the cloud. Start-ups such as Roost and Range OI can retrofit older devices with Internet of Things capabilities. One of the really useful IoT applications could be found in Jins Meme glasses and sunglasses from Japan. They embed wearable sensors that are shaped much like Bluetooth headsets to detect drowsiness in the wearer. The glasses observe eye movement and blinking frequency to identify tiredness or bad posture and communicate via an iOS and Android smartphone app. Finally, Mellow is a new kind of kitchen robot that makes cooking easier by preparing ingredients to perfection while someone is away from home. Mellow is a sous-vide machine that takes orders through your smartphone and keeps food cold until it’s the exact time to start cooking.
Each of the applications mentioned above deals with data, volumes of data, in real-time and in stored fashion. Such data needs to be properly validated, cleansed, and made available at the moment of user engagement. In addition to Informatica’s Intelligent Data Platform, Informatica’s newly introduced Rev product can truly connect data coming from all sources, including IoT devices, and make it available for everyone. What opportunity does IoT present to your organization? Where are the biggest opportunities to disrupt the status quo?
I recently read an opinion piece written in an insurance publication online. The author postulated, among other things, that the Internet of Things would magically deliver great data to an insurer. Yes, it was a statement just that glib. Almost as if there is some fantastic device that you just plug into the wall and out streams a flow of unicorns and rainbows. And furthermore that those unicorns and rainbows will subsequently give a magical boost to your business. But hey, you plugged in that fantastic device, so bring on the magic.
Now, let’s come back from the land of fairytales and ground ourselves in reality. Data is important, no doubt about that. Today, financial services firms are able to access data from so many new data sources. One of those new and fancy data sources is the myriad of devices in this thing we call the Internet of Things.
You ever have one of those frustrating days with your smart phone? Dropped calls, slow Internet, Facebook won’t locate you? Well, other devices experience the same wonkiness. Even the most robust of devices found on commercial aircraft or military equipment are not lossless in data transmission. And that’s where we are with the Internet of Things. All great devices, they serve a number of purposes, but are still fallible in communicating with the “mother ship”.
A telematics device in a consumer vehicle can transmit VIN, speed, latitude/longitude, time, and other vehicle statuses for use in auto insurance. As with other devices on a network, some of these data elements will not come through reliably. That means that in order to reconstruct or smooth the set of data, interpolations need to be made and/or entire entries deleted as useless. That is the first issue. Second, simply receiving this isolated dataset does not make sense of it. The data needs to be moved, cleansed and then correlated to other pieces of the puzzle, which eventually turn into a policyholder, an account holder, a client or a risk. And finally, that enhanced data can be used for further analytics. It can be archived, aggregated, warehoused and secured for additional analysis. None of these activities happen magically. And the sheer volume of integration points and data requires a robust and standardized data management infrastructure.
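To show what "interpolations need to be made and/or entire entries deleted" can look like in practice, here is a minimal sketch that linearly interpolates missing speed readings and drops entries that cannot be reconstructed. The `fill_gaps` function, the (time, speed) layout, and the sample values are assumptions for illustration; linear interpolation is just one simple choice among many smoothing methods.

```python
def fill_gaps(readings):
    """Linearly interpolate missing speeds; drop gaps lacking a neighbor on both sides.

    readings: list of (t_seconds, speed_or_None) tuples, sorted by time.
    """
    known = [(t, v) for t, v in readings if v is not None]
    result = []
    for t, v in readings:
        if v is not None:
            result.append((t, v))
            continue
        before = [p for p in known if p[0] < t]
        after = [p for p in known if p[0] > t]
        if not before or not after:
            continue  # nothing to interpolate between; discard the entry
        (t0, v0), (t1, v1) = before[-1], after[0]
        result.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
    return result

# Hypothetical speed readings with dropped values (None)
readings = [(0, None), (10, 30.0), (20, None), (30, None), (40, 60.0), (50, None)]
cleaned = fill_gaps(readings)
```

Even this toy version has to decide what to keep and what to throw away, which is exactly the kind of rule a standardized data management process needs to make explicit.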
So no, just having an open channel to the stream of noise from your local Internet of Things will not magically deliver you great data. Great data comes from market leading data management solutions from Informatica. So whether you are an insurance company, financial services firm or data provider, being “Insurance Ready” means having great data; ready to use; everywhere…from Informatica.
In the last 50-60 years, we have witnessed another revolution, through the invention of computing machines and the Internet – a digital revolution. It has transformed every industry and allowed us to operate at far greater scale – processing more transactions and in more locations – than ever before. New cities emerged on the map, migrations of knowledge workers throughout the world followed, and the standard of living increased again. And digitally available information transformed how we run businesses, cities, or countries.
Forces Shaping Digital Revolution
Over the last 5-6 years, we’ve witnessed a massive increase in the volume and variety of this information. Leading forces that contributed to this increase are:
- Next generation of software technology connecting data faster from any source
- Little to no hardware cost to process and store huge amounts of data (Moore’s Law)
- A sharp increase in the number of machines and devices generating data that are connected online
- Massive worldwide growth of people connecting online and sharing information
- Speed of Internet connectivity that’s now free in many public places
As a result, our engagement with the digital world is rising – both for personal and business purposes. Increasingly, we play games, shop, sign digital contracts, make product recommendations, respond to customer complaints, share patient data, and make real time pricing changes to in-store products – all from a mobile device or laptop. We do so increasingly in a collaborative way, in real-time, and in a very personalized fashion. Big Data, Social, Cloud, and Internet of Things are key topics dominating our conversations and thoughts around data these days. They are altering the ways we engage with each other and what we expect from each other.
This is either the emergence of a new revolution or the next phase of our digital revolution – the democratization and ubiquity of information to create new ways of interacting with customers and dramatically speeding up market launch. Businesses will build new products and services and create new business models by exploiting this vast new resource of information.
The Quest for Great Data
But, there is work to do before one can unleash the true potential captured in data. Data is no longer a by-product or transaction record. Nor does it have an expiration date anymore. Data now flows through like a river fueling applications, business processes, and human or machine activities. New data gets created on the way and augments our understanding of the meaning behind this data. It is no longer good enough to have good data in isolated projects; rather, great data needs to become accessible to everyone and everything at a moment’s notice. This rich set of data needs to connect efficiently to information that is already present and learn from it. Such data needs to automatically rid itself of inaccurate and incomplete information. Clean, safe, and connected – this data is now ready to find us even before we discover it. It understands the context in which we are going to make use of this information and key decisions that will follow. In the process, this data is learning about our usage, preferences, and results: what works versus what doesn’t. New data is now created that captures such inherent understanding or intelligence. It needs to flow back to appropriate business applications or machines for future usage after fine-tuning. Such data can then tell a story about human or machine actions and results. Such data can become a coach, a mentor, a friend of sorts to guide us through critical decision points. Such data is what we would like to call great data. In order to truly capitalize on the next step of the digital revolution, we will pervasively need this great data to power our decisions and thinking.
Impacting Every Industry
By 2020, there’ll be 50 Billion connected devices, 7x more than human beings on the planet. With this explosion of devices, the associated really big data will increasingly be processed and stored in the cloud. More than size, this complexity will require a new way of addressing business process efficiency that renders agility, simplicity, and capacity. The impact of such transformation will spread across many industries. A McKinsey article, “The Future of Global Payments”, focuses on digital transformation of payment systems in the banking industry and ubiquity of data as a result. One of the key challenges for banks will be to shift from their traditional heavy reliance on siloed and proprietary data to a more open approach that encompasses a broader view of customers.
Industry executives, front line managers, and back office workers are all struggling to make the most sense of the data that’s available.
Closing Thoughts on Great Data
The “2014 PWC Global CEO Survey” showed that 81% ranked technology advances as the #1 factor to transform their businesses over the next 5 years. More data, by itself, isn’t enough for this transformation. A robust data management approach integrating machine and human data, from all sources and updated in real-time, among on-premises and cloud-based systems must be put in place to accomplish this mission. Such an approach will nurture great data. This end-to-end data management platform will provide data guidance and curate one of an organization’s most valuable assets, its information. Only by making sense of what we have at our disposal will we unleash the true potential of the information that we possess. The next step in the digital revolution will be about organizations of all sizes being fueled by great data to unleash their untapped potential.
“Raw materials costs are the company’s single largest expense category,” said Steve Jenkins, Global IT Director at Valspar, at MDM Day in London. “Data management technology can help us improve business process efficiency, manage sourcing risk and reduce RFQ cycle times.”
Valspar is a $4 billion global manufacturing company, which produces a portfolio of leading paint and coating brands. At the end of 2013, the 200 year old company celebrated record sales and earnings. They also completed two acquisitions. Valspar now has 10,000 employees operating in 25 countries.
As is the case for many global companies, growth creates complexity. “Valspar has multiple business units with varying purchasing practices. We source raw materials from 1,000s of vendors around the globe,” shared Steve.
“We want to achieve economies of scale in purchasing to control spending,” Steve said as he shared Valspar’s improvement objectives. “We want to build stronger relationships with our preferred vendors. Also, we want to develop internal process efficiencies to realize additional savings.”
Poorly managed vendor and raw materials data was impacting Valspar’s buying power
The Valspar team, which focuses sharply on productivity, had an “Aha” moment. “We realized our buying power was limited by the age and quality of available vendor data and raw materials data,” revealed Steve.
The core vendor data and raw materials data that should have been the same across multiple systems wasn’t. Data was often missing or wrong. This made it difficult to calculate the total spend on raw materials. It was also hard to calculate the total cost of expedited freight of raw materials. So, employees used a manual, time-consuming and error-prone process to consolidate vendor data and raw materials data for reporting.
These data issues were getting in the way of achieving their improvement objectives. Valspar needed a data management solution.
Valspar needed a single trusted source of vendor and raw materials data
The team chose Informatica MDM, master data management (MDM) technology. It will be their enterprise hub for vendors and raw materials. It will manage this data centrally on an ongoing basis. With Informatica MDM, Valspar will have a single trusted source of vendor and raw materials data.
Informatica PowerCenter will access data from multiple source systems. Informatica Data Quality will profile the data before it goes into the hub. Then, after Informatica MDM does its magic, PowerCenter will deliver clean, consistent, connected and enriched data to target systems.
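For readers new to master data management, the general idea of matching and merging vendor records into a single trusted record can be sketched in a few lines. This is a deliberately simplified illustration, not Informatica's API or Valspar's implementation: the `match_key` and `master` functions, the suffix list, the field names, and the sample records are all hypothetical, and real MDM tools use far more sophisticated matching.

```python
import re
from collections import defaultdict

# Legal suffixes to ignore when matching names (an illustrative, incomplete list)
SUFFIXES = {"inc", "incorporated", "ltd", "limited", "llc", "corp", "corporation", "co"}

def match_key(name):
    """Normalize a vendor name: lowercase, strip punctuation and legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)

def master(records):
    """Group records by match key and merge each group into one golden record."""
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec["name"])].append(rec)
    golden = []
    for key, recs in groups.items():
        merged = {"match_key": key, "sources": [r["source"] for r in recs]}
        for field in ("name", "country", "tax_id"):
            # Survivorship rule (an assumption): first non-empty value wins
            merged[field] = next((r[field] for r in recs if r.get(field)), None)
        golden.append(merged)
    return golden

# Hypothetical vendor records from two source systems
vendor_records = [
    {"source": "erp_eu", "name": "Acme Chemicals Ltd.", "country": "UK", "tax_id": ""},
    {"source": "erp_us", "name": "ACME Chemicals, Limited", "country": "", "tax_id": "GB123"},
    {"source": "erp_us", "name": "Borealis Pigments Inc", "country": "US", "tax_id": "US987"},
]
golden_records = master(vendor_records)
```

Here the two Acme entries collapse into one record that carries the country from one system and the tax ID from the other, which is the kind of consolidation that makes total spend per vendor calculable.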
Better vendor and raw materials data management results in cost savings
Valspar expects to gain the following business benefits:
- Streamline the RFQ process to accelerate raw materials cost savings
- Reduce the total number of raw materials SKUs and vendors
- Increase productivity of staff focused on pulling and maintaining data
- Leverage consistent global data visibility to:
  - increase leverage during contract negotiations
  - improve acquisition due diligence reviews
  - facilitate process standardization and reporting
Valspar’s vision is to transform data and information into trusted organizational assets
“Mastering vendor and raw materials data is Phase 1 of our vision to transform data and information into trusted organizational assets,” shared Steve. In Phase 2 the Valspar team will master customer data so they have immediate access to the total purchases of key global customers. In Phase 3, Valspar’s team will turn their attention to product or finished goods data.
Steve ended his presentation with some advice. “First, include your business counterparts in the process as early as possible. They need to own and drive the business case as well as the approval process. Also, master only the vendor and raw materials attributes required to realize the business benefit.”
Want more? Download the Total Supplier Information Management eBook. It covers:
- Why your fragmented supplier data is holding you back
- The cost of supplier data chaos
- The warning signs you need to be looking for
- How you can achieve Total Supplier Information Management
Last time I talked about how benchmark data can be used in IT and business use cases to illustrate the financial value of data management technologies. This time, let’s look at additional use cases, and at how to philosophically interpret the findings.
So here are some additional areas of investigation for justifying a data quality based data management initiative:
- Data and report preparation and rebuttal for compliance or other audits (FTE cost as above)
- Excess insurance premiums on incorrect asset or party information
- Excess tax payments due to incorrect asset configuration or location
- Excess travel or idle time between jobs due to incorrect location information
- Excess equipment downtime (not revenue generating) or MTTR due to incorrect asset profile or misaligned reference data not triggering timely repairs
- Incorrect equipment location or ownership data causing service costs or revenues to be split incorrectly
- Party relationship data not tied together creating duplicate contacts or less relevant offers and lower response rates
- Lower than industry average cross-sell conversion ratio due to inability to match and link departmental customer records and underlying transactions and expose them to all POS channels
- Lower than industry average customer retention rate due to lack of full client transactional profile across channels or product lines to improve service experience or apply discounts
- Low annual supplier discounts due to incorrect or missing alternate product data or aggregated channel purchase data
I could go on forever, but allow me to touch on a sensitive topic – fines. Fines, or performance penalties by private or government entities, only make sense to bake into your analysis if they happen repeatedly at fairly predictable intervals and are “relatively” small per incident. Otherwise, treat them like M&A activity: nobody will buy into cost savings in the gazillions if a transaction only happens once every ten years. That’s like building a business case for a lottery win or a life insurance payout with a sample size of one family. Sure, if it happens you just made the case, but will it happen…soon?
Use benchmarks and ranges wisely but don’t over-think the exercise either. It will become paralysis by analysis. If you want to make it super-scientific, hire an expensive consulting firm for a 3 month $250,000 to $500,000 engagement and have every staffer spend a few days with them away from their day job to make you feel 10% better about the numbers. Was that worth half a million dollars just in 3rd party cost? You be the judge.
In the end, you are trying to find out, and position, whether a technology will fix a $50,000, $5 million or $50 million problem. You are also trying to gauge where the key areas of improvement are in terms of value and to correlate the associated cost (higher value normally equals higher cost due to higher complexity) and risk. After all, who wants to stand before a budget committee, prophesy massive savings in one area and then fail because it would have been smarter to start with a simpler, quicker win to build upon?
The secret sauce to avoiding this consulting expense and risk is a natural curiosity, willingness to do the legwork of finding industry benchmark data, knowing what goes into them (process versus data improvement capabilities) to avoid inappropriate extrapolation and using sensitivity analysis to hedge your bets. Moreover, trust an (internal?) expert to indicate wider implications and trade-offs. Most importantly, you have to be a communicator willing to talk to many folks on the business side and have criminal interrogation qualities, not unlike in your run-of-the-mill crime show. Some folks just don’t want to talk, often because they have ulterior motives (protecting their legacy investment or process) or hiding skeletons in the closet (recent bad performance). In this case, find more amenable people to quiz or pry the information out of these tough nuts, if you can.
Lastly, if you find ROI numbers that appear astronomical at first, remember that leverage is a key factor. If a technical capability touches one application (credit risk scoring engine), one process (quotation), one type of transaction (talent management self-service) or a limited set of people (procurement), the ROI will be lower than for a technology touching several of each. If your business model drives thousands of high-value (thousands of dollars) transactions versus ten twenty-million-dollar ones or twenty million one-dollar ones, your ROI will be higher. After all, consider this: retail e-mail marketing campaigns average an ROI of 578% (softwareprojects.com), and this with really bad data. Imagine what improved data can do just on that front.
I found massive differences between what improved asset data can deliver in a petrochemical or utility company versus product data in a fashion retailer or customer (loyalty) data in a hospitality chain. The assertion of cum hoc ergo propter hoc is a key assumption in how technology delivers financial value. As long as the business folks agree or can fence in the relationship, you are on the right path.
What’s your best and worst job to justify someone giving you money to invest? Share that story.
Malcolm Gladwell wrote an article in The New Yorker magazine in January, 2007 entitled “Open Secrets.” In the article, he pointed out that a national-security expert had famously made a distinction between puzzles and mysteries.
Osama bin Laden’s whereabouts were, for many years, a puzzle. We couldn’t find him because we didn’t have enough information. The key to the puzzle, it was assumed, would eventually come from someone close to bin Laden, and until we could find that source, bin Laden would remain at large. In fact, that’s precisely what happened. Al-Qaida’s No. 3 leader, Khalid Sheikh Mohammed, gave authorities the nickname of one of bin Laden’s couriers, who then became the linchpin of the CIA’s efforts to locate bin Laden.
By contrast, the problem of what would happen in Iraq after the toppling of Saddam Hussein was a mystery. It wasn’t a question that had a simple, factual answer. Mysteries require judgments and the assessment of uncertainty, and the hard part is not that we have too little information but that we have too much.
This was written before “Big Data” was a household word, and it raises the very interesting question of whether organizations and corporations that are, by anyone’s standards, totally deluged with data are facing puzzles or mysteries. Consider the amount of data that a company like Western Union deals with.
Western Union is a 160-year-old company. Having built scale in the money transfer business, the company is in the process of evolving its business model by enabling the expansion of digital products, growth of web and mobile channels, and a more personalized online customer experience. Sounds good – but get this: the company processes more than 29 transactions per second on average. That’s 242 million consumer-to-consumer transactions and 459 million business payments in a year. Nearly a billion transactions – a billion! As my six-year-old might say, that number is big enough “to go to the moon and back.” Layer on top of that the fact that the company operates in 200+ countries and territories, and conducts business in 120+ currencies. Senior Director and Head of Engineering Abhishek Banerjee has said, “The data is speaking to us. We just need to react to it.” That implies a puzzle, not a mystery – but only if data scientists are able to conduct statistical modeling and predictive analysis, systematically noting trends in sending and receiving behaviors. Check out what Banerjee and Western Union CTO Sanjay Saraf have to say about it here.
Or consider General Electric’s aggressive and pioneering move into what’s dubbed as the industrial internet. In a white paper entitled “The Case for an Industrial Big Data Platform: Laying the Groundwork for the New Industrial Age,” GE reveals some of the staggering statistics related to the industrial equipment that it manufactures and supports (services comprise 75% of GE’s bottom line):
- A modern wind turbine contains approximately 50 sensors and control loops which collect data every 40 milliseconds.
- A farm controller then receives more than 30 signals from each turbine at 160-millisecond intervals.
- At every one-second interval, the farm monitoring software processes 200 raw sensor data points with various associated properties with each turbine.
Phew! I’m no electricity operations expert, and you probably aren’t either. And most of us will get no further than simply wrapping our heads around the simple fact that GE turbines are collecting a LOT of data. But what the paper goes on to say should grab your attention in a big way: “The key to success for this wind farm lies in the ability to collect and deliver the right data, at the right velocity, and in the right quantities to a wide set of well-orchestrated analytics.” And the paper goes on to recommend that anyone involved in the Industrial Internet revolution strongly consider its talent requirements, with the suggestion that Chief Data Officers and/or Data Scientists may be the next critical hires.
Which brings us back to Malcolm Gladwell. In the aforementioned article, Gladwell goes on to pull apart the Enron debacle, and argues that it was a prime example of the perils of too much information. “If you sat through the trial of (former CEO) Jeffrey Skilling, you’d think that the Enron scandal was a puzzle. The company, the prosecution said, conducted shady side deals that no one quite understood. Senior executives withheld critical information from investors…We were not told enough – the classic puzzle premise – was the central assumption of the Enron prosecution.” But in fact, that was not true. Enron employed complicated – but perfectly legal – accounting techniques used by companies that engage in complicated financial trading. Many journalists and professors have gone back and looked at the firm’s regulatory filings, and have come to the conclusion that, while complex and difficult to identify, all of the company’s shenanigans were right there in plain view. Enron cannot be blamed for covering up the existence of its side deals. It didn’t; it disclosed them. As Gladwell summarizes:
“Puzzles are ‘transmitter-dependent’; they turn on what we are told. Mysteries are ‘receiver dependent’; they turn on the skills of the listener.”
I would argue that this extremely complex, fast moving and seismic shift that we call Big Data will favor those who have developed the ability to attune, to listen and make sense of the data. Winners in this new world will recognize what looks like an overwhelming and intractable mystery, and break that mystery down into small and manageable chunks and demystify the landscape, to uncover the important nuggets of truth and significance.
A mid-sized insurer recently approached our team for help. They wanted to understand how they fell short in making their case to their executives. Specifically, they proposed that fixing their customer data was key to supporting the executive team’s highly aggressive 3-year growth plan. (This plan was 3x today’s revenue). Given this core organizational mission – aside from being a warm and fuzzy place to work supporting its local community – the slam dunk solution to help here is simple. Just reducing the data migration effort around the next acquisition or avoiding the ritual annual, one-off data clean-up project already pays for any tool set enhancing data acquisitions, integration and hygiene. Will it get you to 3x today’s revenue? It probably won’t. What will help are the following:
Hard cost avoidance via software maintenance or consulting elimination is the easy part of the exercise. That is why CFOs love it and focus so much on it. It is easy to grasp and immediate (aka next quarter).
Soft cost reductions, like staff redundancies, are a bit harder. Although they are viable, in my experience very few decision makers want to work on a business case to lay off staff. My team has had one so far. Most look at these savings as freed-up capacity, which can be redeployed more productively. Productivity is also a bit harder to quantify, as you typically have to understand how data travels and gets worked on between departments.
However, revenue effects are even harder to quantify, and more esoteric to many people, as they include projections. They are often considered “soft” benefits, although they outweigh the other areas by 2-3 times in terms of impact. Ultimately, every organization runs its strategy based on projections (see the insurer in my first paragraph).
The hardest to quantify is risk. Not only is it based on projections – often from a third party (Moody’s, TransUnion, etc.) – but few people understand it. More often, clients don’t even accept you investigating this area if you don’t have an advanced degree in insurance math. Nevertheless, risk can generate extra “soft” cost avoidance (beefing up reserve account balance creating opportunity cost) but also revenue (realizing a risk premium previously ignored). Often risk profiles change due to relationships, which can be links to new “horizontal” information (transactional attributes) or vertical (hierarchical) from parent-child relationships of an entity and the parent’s or children’s transactions.
Given the above, my initial advice to the insurer would be to look at the heartache of their last acquisition, use a benchmark for IT productivity from improved data management capabilities (typically 20-26% – Yankee Group) and there you go. This is just the IT side, so consider increasing the upper range by 1.4x (Harvard Business School), as every attribute change (last mobile view date) requires additional meetings at the manager, director and VP level. These people’s time gets increasingly expensive. You could also use Aberdeen’s benchmark of 13 hours per average master data attribute fix instead.
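As a rough Python sketch of that arithmetic: the function names, the $400,000 migration cost, the 150 attribute fixes and the $60 hourly rate are all hypothetical placeholders, and multiplying the upper end of the range by 1.4 is only one reading of the adjustment described above.

```python
IT_PRODUCTIVITY_RANGE = (0.20, 0.26)  # IT productivity benchmark quoted above
BUSINESS_MULTIPLIER = 1.4             # assumed to scale the upper end only
HOURS_PER_ATTRIBUTE_FIX = 13          # alternative benchmark quoted above


def productivity_savings_range(baseline_cost):
    """Return (low, high) savings on a data-related effort such as a migration."""
    low_rate, high_rate = IT_PRODUCTIVITY_RANGE
    low = baseline_cost * low_rate
    high = baseline_cost * high_rate * BUSINESS_MULTIPLIER
    return low, high


def attribute_fix_cost(fixes_per_year, hourly_rate):
    """Annual cost of master data attribute fixes at 13 hours per fix."""
    return fixes_per_year * HOURS_PER_ATTRIBUTE_FIX * hourly_rate


# Hypothetical inputs: replace with figures from your last acquisition.
low, high = productivity_savings_range(400_000)
print(f"Migration savings: ${low:,.0f} to ${high:,.0f}")
print(f"Attribute fix cost: ${attribute_fix_cost(150, 60.0):,.0f}")
```

Reporting both ends of the range, rather than a single number, keeps the discussion with finance focused on the order of magnitude.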
You can also look at productivity areas, which are typically overly measured. Let’s assume a call center rep spends 20% of the average call time of 12 minutes (depending on the call type – account or bill inquiry, dispute, etc.) understanding:
- Who the customer is
- What he bought online and in-store
- If he tried to resolve his issue on the website or store
- How he uses equipment
- What he cares about
- If he prefers call backs, SMS or email confirmations
- His response rate to offers
- His/her value to the company
If he spends this 20% of every call stringing together insights from five applications and twelve screens, instead of seeing the same information in one frame within seconds in every application he touches, you have just freed up 20% worth of his hourly compensation.
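A minimal sketch of the call center calculation, in Python. The 12-minute call and the 20% share come from the example above; the call volume and the $30 fully loaded hourly cost are hypothetical, and the result is an upper bound because it assumes the whole lookup time disappears.

```python
def call_center_savings(calls_per_year, hourly_cost,
                        avg_call_minutes=12.0, lookup_share=0.20):
    """Annual cost of the time reps spend piecing together customer context."""
    lookup_hours = calls_per_year * avg_call_minutes * lookup_share / 60.0
    return lookup_hours * hourly_cost


# Hypothetical inputs: 500,000 calls a year, $30 per rep hour.
savings = call_center_savings(calls_per_year=500_000, hourly_cost=30.0)
print(f"Upper-bound annual savings: ${savings:,.0f}")
```

If the single customer view only cuts the lookup time rather than removing it, lower `lookup_share` to the share actually saved.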
Then look at the software, hardware, maintenance and ongoing management of the likely customer record sources (pick the worst and best quality one based on your current understanding), which will end up in a centrally governed instance. Per DAMA, every duplicate record will cost you between $0.45 (party) and $0.85 (product) per transaction (edit touch). At the very least each record will be touched once a year (likely 3-5 times), so multiply your duplicated record count by that and you have your savings from just de-duplication. You can also use Aberdeen’s benchmark of 71 serious errors per 1,000 records, meaning the chance of transactional failure and required effort (% of one or more FTE’s daily workday) to fix is high. If this does not work for you, run a data profile with one of the many tools out there.
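The de-duplication estimate can be sketched in Python as follows. The per-touch costs and the 71-per-1,000 error rate are the benchmarks quoted above; the function names and the record counts are hypothetical.

```python
COST_PER_TOUCH = {"party": 0.45, "product": 0.85}  # dollars per edit touch
SERIOUS_ERRORS_PER_1000 = 71                       # error-rate benchmark


def dedup_savings_range(duplicate_records, record_type,
                        min_touches=1, max_touches=5):
    """Return (low, high) annual savings from removing duplicate records."""
    cost = COST_PER_TOUCH[record_type]
    low = duplicate_records * cost * min_touches
    high = duplicate_records * cost * max_touches
    return low, high


def expected_serious_errors(total_records):
    """Number of records expected to carry a serious error."""
    return total_records * SERIOUS_ERRORS_PER_1000 / 1000


# Hypothetical inputs: 200,000 duplicate party records, 1,000,000 records total.
low, high = dedup_savings_range(200_000, "party")
print(f"De-duplication savings: ${low:,.0f} to ${high:,.0f} per year")
print(f"Records with serious errors: {expected_serious_errors(1_000_000):,.0f}")
```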
If standardization of records (zip codes, billing codes, currency, etc.) is the problem, ask your business partner how many customer contacts (calls, mailings, emails, orders, invoices or account statements) fail outright and/or require validation because of these attributes. Once again, if you apply the productivity gains mentioned earlier, there are your savings. If you look at the number of orders whose payment or revenue recognition gets delayed by a week or a month, and at the average order amount, you can quantify how much profit (multiply by operating margin) you could pull into the current financial year from the next one.
The same is true for speeding up the introduction of a new product, or a change to it, so that it generates profits earlier. Note that the time value of funds realized earlier is too small to matter in most instances, especially in the current interest rate environment.
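As a sketch of the delayed-order estimate in Python: count the orders whose payment or revenue recognition slips into the next financial year, multiply by the average order amount and then by operating margin. The function name and all three input values are hypothetical.

```python
def profit_pulled_forward(delayed_orders, avg_order_value, operating_margin):
    """Profit recognized this year instead of next once the delay is removed."""
    return delayed_orders * avg_order_value * operating_margin


# Hypothetical inputs: 1,200 orders slip past year-end,
# $2,500 average order, 12% operating margin.
profit = profit_pulled_forward(1_200, 2_500.0, 0.12)
print(f"Profit pulled into the current year: ${profit:,.0f}")
```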
If emails bounce back or snail mail gets returned (no such address, no such name at this address, no such domain, no such user at this domain), (e)mail verification tools can help reduce the bounces. If every mail piece (forget email due to its minuscule cost) costs $1.25, and this will vary by type of mailing (catalog, promotional postcard, statement letter), every piece sent to an incorrect or incomplete record is wasted cost. If you can, use fully loaded print cost, including third-party data prep and returns handling. You will never capture all cost inputs, but take a conservative stab.
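The wasted mailing cost is a one-line multiplication, sketched here in Python. The $1.25 per piece is the figure used above; the mailing volume and the 4% undeliverable rate are hypothetical and should come from your own returns data.

```python
def wasted_mail_cost(pieces_mailed, undeliverable_rate, cost_per_piece=1.25):
    """Cost of mail pieces sent to incorrect or incomplete records."""
    return pieces_mailed * undeliverable_rate * cost_per_piece


# Hypothetical inputs: 250,000 pieces a year, 4% returned as undeliverable.
waste = wasted_mail_cost(250_000, 0.04)
print(f"Wasted mailing cost: ${waste:,.0f}")
```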
If it was an offer, reduced bounces should also improve your response rate (also true for email now). Prospect mail response rates are typically around 1.2% (Direct Marketing Association), whereas phone response rates are around 8.2%. If you know that your current response rate is half that (for argument’s sake) and you send out 100,000 emails of which 1.3% (Silverpop) have customer data issues, then fixing 81-93% of them (our experience) will drop the bounce rate to under 0.3%, meaning more emails will arrive and be relevant. This in turn, multiplied by a standard conversion rate (MarketingSherpa) of 3% (industry and channel specific) and your average order (your data), multiplied by operating margin, gets you a benefit value for revenue.
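The email example can be worked through in Python as below. The 100,000 emails, 1.3% issue rate, 81-93% fix rate and 3% conversion rate are the figures quoted above; the $80 average order and 10% operating margin are hypothetical stand-ins for your own data.

```python
def email_revenue_benefit(emails_sent, issue_rate, fix_rate,
                          conversion_rate, avg_order_value, operating_margin):
    """Return (recovered emails, residual bounce rate, profit benefit)."""
    recovered = emails_sent * issue_rate * fix_rate
    residual_bounce_rate = issue_rate * (1 - fix_rate)
    benefit = recovered * conversion_rate * avg_order_value * operating_margin
    return recovered, residual_bounce_rate, benefit


# Evaluate both ends of the 81-93% fix-rate range.
for fix_rate in (0.81, 0.93):
    recovered, residual, benefit = email_revenue_benefit(
        emails_sent=100_000, issue_rate=0.013, fix_rate=fix_rate,
        conversion_rate=0.03,
        avg_order_value=80.0,   # hypothetical
        operating_margin=0.10,  # hypothetical
    )
    print(f"fix rate {fix_rate:.0%}: {recovered:,.0f} emails recovered, "
          f"residual bounce rate {residual:.2%}, benefit ${benefit:,.2f}")
```

On a single campaign the benefit is small; it only becomes material once you multiply by the number of campaigns per year and the size of the list.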
If product data and inventory carrying cost or supplier spend are your issue, find out how many supplier shipments you receive every month and the average cost of a part (or cost range), then apply the Aberdeen master data failure rate (71 in 1,000) to use cases around missing or incorrect supersession or alternate part data to assess the value of a single shipment’s overspend. You can also just use the ending inventory amount from the 10-K report and apply a 3-10% improvement (Aberdeen) in a top-down approach. Alternatively, apply 3.2-4.9% to your annual supplier spend (KPMG).
You could also investigate the expediting or return cost of shipments in a period due to incorrectly aggregated customer forecasts, wrong or incomplete product information or wrong shipment instructions in a product or location profile. Apply Aberdeen’s 5% improvement rate and there you go.
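The top-down benchmarks for inventory, supplier spend and expediting cost reduce to a few multiplications, sketched here in Python. The percentages are the benchmark figures quoted above; the three dollar amounts are hypothetical and would come from your own 10-K and procurement data.

```python
def savings_range(base_amount, low_rate, high_rate):
    """Return (low, high) savings for a benchmark improvement range."""
    return base_amount * low_rate, base_amount * high_rate


# Hypothetical inputs.
ending_inventory = 40_000_000
annual_supplier_spend = 120_000_000
annual_expediting_cost = 2_000_000

inventory_low, inventory_high = savings_range(ending_inventory, 0.03, 0.10)
spend_low, spend_high = savings_range(annual_supplier_spend, 0.032, 0.049)
expediting_savings = annual_expediting_cost * 0.05

print(f"Inventory: ${inventory_low:,.0f} to ${inventory_high:,.0f}")
print(f"Supplier spend: ${spend_low:,.0f} to ${spend_high:,.0f}")
print(f"Expediting and returns: ${expediting_savings:,.0f}")
```

These estimates overlap, so treat them as alternative views of the same opportunity rather than amounts to add together.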
Consider that a North American utility told us that just fixing their 200 Tier1 suppliers’ product information achieved an increase in discounts from $14 to $120 million. They also found that fixing one basic out of sixty attributes in one part category saves them over $200,000 annually.
So what ROI percentages would you find tolerable or justifiable for, say an EDW project, a CRM project, a new claims system, etc.? What would the annual savings or new revenue be that you were comfortable with? What was the craziest improvement you have seen coming to fruition, which nobody expected?
Next time, I will add some more “use cases” to the list and look at some philosophical implications of averages.