This blog post initially appeared on Exterro and is reblogged here with their consent.
As data volumes increase and become more complex, having an integrated e-discovery environment where systems and data sources are automatically synching information and exchanging data with e-discovery applications has become critical for organizations. This is true of unstructured and semi-structured data sources, such as email servers and content management systems, as well as structured data sources, like databases and data archives. The topic of systems integration will be discussed on Exterro’s E-Discovery Masters series webcast, “Optimizing E-Discovery in a Multi-Vendor Environment.” The webcast is CLE-accredited and will air on Thursday, September 4 at 1pm ET/10am PT. Learn more and register here.
I recently interviewed Jim FitzGerald, Sr. Director for Exterro, and Josh Alpern, VP ILM Domain Experts for Informatica, about the important and often overlooked role structured data plays during the course of e-discovery.
Q: E-Discovery demands are often discussed in the context of unstructured data, like email. What are some of the complications that arise when a matter involves structured data?
Jim: A lot of e-discovery practitioners are comfortable with unstructured data sources like email, file shares, or documents in SharePoint, but freeze up when they have to deal with structured data. They are unfamiliar with the technology and terminology of databases, extracts, report generation, and archives. They’re unsure about the best ways to preserve or collect from these sources. If the application is an old one, this fear often gets translated into a mandate to keep everything just as it is, which translates to mothballed applications that just sit there in case data might be needed down the road. Beyond the costs, there’s also the issue that IT staff turnover means that it’s increasingly hard to generate the reports Legal and Compliance need from these old systems.
Josh: Until now, e-discovery has largely been applied to unstructured data and email for two main reasons: 1) a large portion of relevant data resides in these types of data stores, and 2) these are the data formats that everyone is most familiar with and can relate to most easily. We all use email, and we all use files and documents. So it’s easy for people to look at an email or a document and understand that everything is self-contained in that one “thing.” But structured data is different, although not necessarily any less relevant. For example, someone might understand conceptually what a “purchase order” is, but not realize that in a financial application a purchase order consists of data that is spread across 50 different database tables. Unlike with an email or a PDF document, there might not be an easy way to simply produce a purchase order, in this example, for legal discovery without understanding how those 50 database tables are related to each other. Furthermore, to use email as a comparison, everyone understands what an email “thread” is. It’s easy to ask for all the emails in a single thread, and usually it’s relatively easy to identify all of those emails: they all have the same subject line. But in structured data the situation can be much more complicated. If someone asks to see every financial document related to a single purchase order, you would have to understand all of the connections between the many database tables that comprise all of those related documents and how they relate back to the requested purchase order. Solutions that are focused on email and unstructured data have no means to do this.
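To make the purchase order point a little more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The three-table schema, the table and column names, and the sample rows are all hypothetical and far simpler than the 50 tables Josh describes; the only point is that producing one “document” means knowing which tables to join and how.

```python
import sqlite3

# Hypothetical, heavily simplified schema. A real financial application
# might spread a purchase order across dozens of tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier  (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE po_header (po_id INTEGER PRIMARY KEY, supplier_id INTEGER, ordered_on TEXT);
    CREATE TABLE po_line   (po_id INTEGER, line_no INTEGER, item TEXT, qty INTEGER, unit_price REAL);

    INSERT INTO supplier  VALUES (7, 'Acme Parts');
    INSERT INTO po_header VALUES (1001, 7, '2014-08-01');
    INSERT INTO po_line   VALUES (1001, 1, 'gasket', 10, 2.50);
    INSERT INTO po_line   VALUES (1001, 2, 'valve', 4, 20.00);
""")

def fetch_purchase_order(po_id):
    """Reassemble one purchase order by joining the tables that hold its pieces."""
    return conn.execute(
        """
        SELECT h.po_id, s.name, h.ordered_on, l.line_no, l.item, l.qty, l.unit_price
        FROM po_header AS h
        JOIN supplier  AS s ON s.supplier_id = h.supplier_id
        JOIN po_line   AS l ON l.po_id = h.po_id
        WHERE h.po_id = ?
        ORDER BY l.line_no
        """,
        (po_id,),
    ).fetchall()

order_rows = fetch_purchase_order(1001)
# The order total is not stored anywhere; it has to be derived from the lines.
order_total = sum(qty * price for *_, qty, price in order_rows)
```

Without knowledge of the join keys (here `supplier_id` and `po_id`), none of the three tables on its own looks like a purchase order, which is the gap that tools built for email and files cannot close.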
Q: What types of matters tend to implicate structured data and are they becoming more or less common?
Jim: The ones I hear about most commonly are product liability cases where they need to look back at warranty claims or drug trial data, or employment disputes around pay history and practices, or financial cases where they need to look at pricing or trading patterns.
Josh: The ones that Jim mentioned are certainly prevalent. But in addition, I would add that all kinds of financial data are now governed by retention policies largely because of the same concerns that arise from potential legal situations: at some point, someone may ask for it. Anything related to consumer packaged goods, vehicle parts (planes, boats, cars, trucks, etc.) as well as industrial and durable goods, which tend to have very long lifecycles, are increasingly subject to these types of inquiries.
Q: Simply accessing legacy data to determine its relevance to a matter can present significant challenges. What are some methods by which organizations can streamline the process?
Jim: If you are keeping around mothballed applications and databases purely for reporting purposes, these are prime targets to migrate to a structured data archive. Cost savings from licenses, CPU, and storage can run to 65% per year, with the added benefits that it’s much easier to enforce a retention policy on this data and roll it off when it expires, and that compliance reporting is easier to do with modern tools.
Josh: One huge challenge that comes from these legacy applications stems from the fact that there are typically a lot of them. That means that when a discovery request arises, someone – or more likely multiple people – have to go to each one of those applications one by one to search for and retrieve relevant data. Not only is that time consuming and cumbersome, but it also assumes that there are people with the skill sets and application knowledge necessary to interact with all of those different applications. In any given company, that might not be a problem *today*, shortly after the applications have been decommissioned, because all the people that used the applications when they were live are still around. But will that still be the case 5, 7, 10 or 20 years from now? Probably not. Retiring all of these legacy applications into a “platform neutral” format is a much more sustainable, not to mention cost effective, approach.
Q: How can e-discovery preservation and collection technologies be leveraged to help organizations identify and “lock down” structured data?
Jim: Integrating e-discovery (legal holds and collections) with your structured data archive can make it a lot easier to coordinate preservation and collection activities across the two systems. This reduces the chances of stranded holds (data under preservation that could have been released) and reduces the ambiguity about what needs to happen to the data to support the needs of legal and compliance teams.
Josh: Just as there are solutions for “locking down” unstructured and semi-structured (email) data, there are solutions for locking down structured data. The first and perhaps most important step is recognizing that the solutions for unstructured and semi-structured data are simply incapable of handling structured data. Without something that is purpose built for structured data, your discovery preservation and collection process is going to ignore this entire category of data. The good news is that some of the solutions that are purpose built for structured data have built in integrations to the leading e-discovery platforms.
You can hear more from Informatica’s Josh Alpern and Exterro’s Jim FitzGerald by attending Exterro’s CLE-accredited webcast, “Optimizing E-Discovery in a Multi-Vendor Environment,” airing on Thursday, September 4. Learn more and register here.
Get connected. Be connected. Make connections. Find connections. The Internet of Things (IoT) is all about connecting people, processes, data and, as the name suggests, things. The recent social media frenzy surrounding the ALS Ice Bucket Challenge has certainly reminded everyone of the power of social media, the Internet and a willingness to answer a challenge. Fueled by personal and professional connections, the craze has transformed fund raising for at least one charity. Similarly, IoT may potentially be transformational to the business of the public sector, should government step up to the challenge.
Government is struggling with the concept and reality of how IoT really relates to the business of government, and perhaps rightfully so. For commercial enterprises, IoT is far more tangible and simply more fun. Gaming, televisions, watches, Google glasses, smartphones and tablets are all about delivering over-the-top, new and exciting consumer experiences. Industry is delivering transformational innovations, which are connecting people to places, data and other people at a record pace.
It’s time to accept the challenge. Government agencies need to keep pace with their commercial counterparts and harness the power of the Internet of Things. The end game is not to deliver new, faster, smaller, cooler electronics; the end game is to create solutions that let devices connecting to the Internet interact and share data, regardless of their location, manufacturer or format and make or find connections that may have been previously undetectable. For some, this concept is as foreign or scary as pouring ice water over their heads. For others, the new opportunity to transform policy, service delivery, leadership, legislation and regulation is fueling a transformation in government. And it starts with one connection.
One way to start could be linking previously siloed systems together or creating a golden record of all citizen interactions through a Master Data Management (MDM) initiative. It could start with a big data and analytics project to determine and mitigate risk factors in education or linking sensor data across multiple networks to increase intelligence about potential hacking or breaches. Agencies could stop waste, fraud and abuse before it happens by linking critical payment, procurement and geospatial data together in real time.
This is the Internet of Things for government. This is the challenge. This is transformation.
You probably know this already, but I’m going to say it anyway: It’s time you changed your infrastructure. I say this because most companies are still running infrastructure optimized for ERP, CRM and other transactional systems. That’s all well and good for running IT-intensive, back-office tasks. Unfortunately, this sort of infrastructure isn’t great for today’s business imperatives of mobility, cloud computing and Big Data analytics.
Virtually all of these imperatives are fueled by information gleaned from potentially dozens of sources to reveal our users’ and customers’ activities, relationships and likes. Forward-thinking companies are using such data to find new customers, retain existing ones and increase their market share. The trick lies in translating all this disparate data into useful meaning. And to do that, IT needs to move beyond focusing solely on transactions, and instead shine a light on the interactions that matter to their customers, their products and their business processes.
They need what we at Informatica call a “Data First” perspective. You can check out my first blog post about being Data First here.
A Data First POV changes everything from product development, to business processes, to how IT organizes itself and, most especially, the impact IT has on your company’s business. That’s because cloud computing, Big Data and mobile app development shift IT’s responsibilities away from running and administering equipment, onto aggregating, organizing and improving myriad data types pulled in from internal and external databases, online posts and public sources. And that shift makes IT a more-empowering force for business change. Think about it: The ability to connect and relate the dots across data from multiple sources finally gives you real power to improve entire business processes, departments and organizations.
I like to say that the role of IT is now “big I, little t,” with that lowercase “t” representing both technology and transactions. But that role requires a new set of priorities. They are:
- Think about information infrastructure first and application infrastructure second.
- Create great data by design. Architect for connectivity, cleanliness and security. Check out the eBook Data Integration for Dummies.
- Optimize for speed and ease of use – SaaS and mobile applications change often. Click here to try Informatica Cloud for free for 30 days.
- Make data a team sport. Get tools into your users’ hands so they can prepare and interact with it.
I never said this would be easy, and there’s no blueprint for how to go about doing it. Still, I recognize that a little guidance will be helpful. In a few weeks, Informatica’s CIO Eric Johnson and I will talk about how we at Informatica practice what we preach.
Informatica Cloud Summer ’14 Release Breaks Down Barriers with Unified Data Integration and Application Integration for Real Time and Bulk Patterns
This past week, Informatica Cloud marked an important milestone with the Summer 2014 release of the Informatica Cloud platform. This was the 20th Cloud release, and I am extremely proud of what our team has accomplished.
“SDL’s vision is to help our customers use data insights to create meaningful experiences, regardless of where or how the engagement occurs. It’s multilingual, multichannel and on a global scale. Being able to deliver the right information at the right time to the right customer with Informatica Cloud Summer 2014 is critical to our business and will continue to set us apart from our competition.”
– Paul Harris, Global Business Applications Director, SDL plc
When I joined Informatica Cloud, I knew that it had the broadest cloud integration portfolio in the marketplace: leading data integration and analytic capabilities for bulk integration, comprehensive cloud master data management and test data management, and over a hundred connectors for cloud apps, enterprise systems and legacy data sources, all delivered in a self-service design with point-and-click wizards for citizen integrators, without the need for complex and costly manual custom coding.
But, I also learned that our broad portfolio belies another structural advantage: because of Informatica Cloud’s unique, unified platform architecture, it has the ability to surface application (or real time) integration capabilities alongside its data integration capabilities with shared metadata across real time and batch workflows.
With the Summer 2014 release, we’ve brought our application integration capabilities to the forefront. We now provide the most-complete cloud app integration capability in the marketplace. With a design environment that’s meant not for just developers but also line of business IT, now app admins can also build real time process workflows that cut across on-premise and cloud and include built-in human workflows. And with the capability to translate these process workflows instantly into mobile apps for iPhone and Android mobile devices, we’re not just setting ourselves apart but also giving customers the unique capabilities they need for their increasingly mobile employees.
“Schneider’s strategic initiative to improve front-office performance relied on recording and measuring sales person engagement in real time on any mobile device or desktop. The enhanced real time cloud application integration features of Informatica Cloud Summer 2014 makes it all possible and was key to the success of a highly visible and transformative initiative.”
– Mark Nardella, Global Sales Process Director, Schneider Electric SE
With this release, we’re also giving customers the ability to create workflows around data sharing that mix and match batch and real time integration patterns. This is really important. Because unlike the past, where you had to choose between batch and real time, in today’s world of on-premise, cloud-based, transactional and social data, you’re now more than ever having to deal with both real time interactions and the processing of large volumes of data. For example, let’s surmise a typical scenario these days at high-end retail stores. Using a clienteling iPad app, the sales rep looks up bulk purchase history and inventory availability data in SAP, confirms availability and delivery date, and then processes the customer’s order via real time integration with NetSuite. And if you ask any customer, having a single workflow to unify all of that for instant and actionable insights is a huge advantage.
“Our industry demands absolute efficiency, speed and trust when dealing with financial information, and the new cloud application integration feature in the latest release of Informatica Cloud will help us service our customers more effectively by delivering the data they require in a timely fashion. Keeping call-times to a minimum and improving customer satisfaction in real time.”
– Kimberly Jansen, Director CRM, Misys PLC
We’ve also included some exciting new Vibe Integration packages or VIPs. VIPs deliver pre-built business process mappings between front-office and back-office applications. The Summer 2014 release includes new bidirectional VIPs for Siebel to Salesforce and SAP to Salesforce that make it easier for customers to connect their Salesforce with these mission-critical business applications.
And lastly, but not least importantly, the release includes a critical upgrade to our API Framework that provides the Informatica Cloud iPaaS end-to-end support for connectivity to any company’s internal or external APIs. With the newly available API creation, definition and consumption patterns, developers or citizen integrators can now easily expose integrations as APIs and users can consume them via integration workflows or apps, without the need for any additional custom code.
The features and capabilities released this summer are available to all existing Informatica Cloud customers, and everyone else through our free 30-day trial offer.
Since the survey was published, many enterprises have, indeed, leveraged the cloud to host business data in both IaaS and SaaS incarnations. Overall, there seem to be two types of enterprises: First are the enterprises that get the value of data integration. They leverage the value of cloud-based systems, and do not create additional data silos. Second are the enterprises that build cloud-based data silos without a sound data integration strategy, and thus take a few steps backward, in terms of effectively leveraging enterprise data.
There are facts about data integration that most in enterprise IT don’t yet understand, and the use of cloud-based resources actually makes things worse. The shame of it all is that, with a bit of work and some investment, the value should come back to the enterprises 10 to 20 times over. Let’s consider the facts.
Fact 1: Implement new systems, such as those being stood up on public cloud platforms, and any data integration investment comes back 10 to 20 fold. The focus is typically too much on cost and not enough on the benefit, when building a data integration strategy and investing in data integration technology.
Many in enterprise IT point out that their problem domain is unique, and thus their circumstances need special consideration. While I always perform domain-specific calculations, the patterns of value typically remain the same. You should determine the metrics that are right for your enterprise, but the positive values will be fairly consistent, with some varying degrees.
Fact 2: It’s not just about data moving from place to place; it’s also about the proper management of data. This includes a central understanding of data semantics (metadata), and a place to manage a “single version of the truth” when it comes to dealing with the massive amounts of distributed data that enterprises must typically manage, data that is now also distributed within public clouds.
Most of those who manage enterprise data, cloud or no-cloud, have no common mechanism to deal with the meaning of the data, or even the physical location of the data. While data integration is about moving data from place to place to support core business processes, it should come with a way to manage the data as well. This means understanding, protecting, governing, and leveraging the enterprise data, both locally and within public cloud providers.
Fact 3: Some data belongs on clouds, and some data belongs in the enterprise. Some in enterprise IT have pushed back on cloud computing, stating that data outside the firewall is a bad idea due to security, performance, legal issues…you name it. Others try to move all data to the cloud. The point of value is somewhere in between.
The fact of the matter is that the public cloud is not the right fit for all data. Enterprise IT must carefully consider the tradeoff between cloud-based and in-house, including performance, security, compliance, etc. Finding the best location for the data is the same problem we’ve dealt with for years. Now we have cloud computing as an option. Work from your requirements to the target platform, and you’ll find what I’ve found: Cloud is a fit some of the time, but not all of the time.
That second question is a killer because most people — no matter if they’re in marketing, sales or manufacturing — rely on incomplete, inaccurate or just plain wrong information. Regardless of industry, we’ve been fixated on historic transactions because that’s what our systems are designed to provide us.
“Moneyball: The Art of Winning an Unfair Game” gives a great example of what I mean. The book (not the movie) describes Billy Beane hiring MBAs to map out the factors that would win a baseball game. They discovered something completely unexpected: That getting more batters on base would tire out pitchers. It didn’t matter if batters had multi-base hits, and it didn’t even matter if they walked. What mattered was forcing pitchers to throw ball after ball as they faced an unrelenting string of batters. Beane stopped looking at RBIs, ERAs and even home runs, and started hiring batters who consistently reached first base. To me, the book illustrates that the most useful knowledge isn’t always what we’ve been programmed to depend on or what is delivered to us via one app or another.
For years, people across industries have turned to ERP, CRM and web analytics systems to forecast sales and acquire new customers. By their nature, such systems are transactional, forcing us to rely on history as the best predictor of the future. Sure it might be helpful for retailers to identify last year’s biggest customers, but that doesn’t tell them whose blogs, posts or Tweets influenced additional sales. Isn’t it time for all businesses, regardless of industry, to adopt a different point of view — one that we at Informatica call “Data-First”? Instead of relying solely on transactions, a data-first POV shines a light on interactions. It’s like having a high knowledge IQ about relationships and connections that matter.
A data-first POV changes everything. With it, companies can unleash the killer app, the killer sales organization and the killer marketing campaign. Imagine, for example, if a sales person meeting a new customer knew that person’s concerns, interests and business connections ahead of time? Couldn’t that knowledge — gleaned from Tweets, blogs, LinkedIn connections, online posts and transactional data — provide a window into the problems the prospect wants to solve?
That’s the premise of two startups I know about, and it illustrates how a data-first POV can fuel innovation for developers and their customers. Today, we’re awash in data-fueled things that are somehow attached to the Internet. Our cars, phones, thermostats and even our wristbands are generating and gleaning data in new and exciting ways. That’s knowledge begging to be put to good use. The winners will be the ones who figure out that knowledge truly is power, and wield that power to their advantage.
A mid-sized insurer recently approached our team for help. They wanted to understand how they fell short in making their case to their executives. Specifically, they proposed that fixing their customer data was key to supporting the executive team’s highly aggressive 3-year growth plan. (This plan was 3x today’s revenue). Given this core organizational mission – aside from being a warm and fuzzy place to work supporting its local community – the slam dunk solution to help here is simple. Just reducing the data migration effort around the next acquisition or avoiding the ritual annual, one-off data clean-up project already pays for any tool set enhancing data acquisitions, integration and hygiene. Will it get you to 3x today’s revenue? It probably won’t. What will help are the following:
Hard cost avoidance via software maintenance or consulting elimination is the easy part of the exercise. That is why CFOs love it and focus so much on it. It is easy to grasp and immediate (aka next quarter).
Soft cost reductions, like staff redundancies, are a bit harder. Despite them being viable, in my experience very few decision makers want to work on a business case to lay off staff. My team has had one so far. They look at these savings as freed up capacity, which can be re-deployed more productively. Productivity is also a bit harder to quantify as you typically have to understand how data travels and gets worked on between departments.
However, revenue effects are even harder and esoteric to many people as they include projections. They are often considered “soft” benefits, although they outweigh the other areas by 2-3 times in terms of impact. Ultimately, every organization runs their strategy based on projections (see the insurer in my first paragraph).
The hardest to quantify is risk. Not only is it based on projections – often from a third party (Moody’s, TransUnion, etc.) – but few people understand it. More often, clients don’t even accept you investigating this area if you don’t have an advanced degree in insurance math. Nevertheless, risk can generate extra “soft” cost avoidance (beefing up reserve account balance creating opportunity cost) but also revenue (realizing a risk premium previously ignored). Often risk profiles change due to relationships, which can be links to new “horizontal” information (transactional attributes) or vertical (hierarchical) from parent-child relationships of an entity and the parent’s or children’s transactions.
Given the above, my initial advice to the insurer would be to look at the heartache of their last acquisition, use a benchmark for IT productivity from improved data management capabilities (typically 20-26% – Yankee Group) and there you go. This is just the IT side so consider increasing the upper range by 1.4x (Harvard Business School) as every attribute change (last mobile view date) requires additional meetings on a manager, director and VP level. These people’s time gets increasingly more expensive. You could also use Aberdeen’s benchmark of 13hrs per average master data attribute fix instead.
You can also look at productivity areas, which are typically overly measured. Let’s assume a call center rep spends 20% of the average call time of 12 minutes (depending on the call type – account or bill inquiry, dispute, etc.) understanding
- Who the customer is
- What he bought online and in-store
- If he tried to resolve his issue on the website or store
- How he uses equipment
- What he cares about
- If he prefers call backs, SMS or email confirmations
- His response rate to offers
- His/her value to the company
If he spends that 20% of every call stringing together insights from five applications and twelve screens, and you instead give him one frame that loads in seconds and shows the same information in every application he touches, you have just freed up 20% worth of his hourly compensation.
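As a rough sketch of that call center arithmetic: the 12-minute call and the 20% share come from the example above, while the hourly compensation and the calls per rep per year are purely hypothetical placeholders to swap for your own numbers.

```python
def lookup_cost_per_call(avg_call_minutes, lookup_share, hourly_compensation):
    """Cost of the part of one call spent piecing together customer context."""
    lookup_minutes = avg_call_minutes * lookup_share
    return lookup_minutes / 60 * hourly_compensation

# From the text: 12-minute average call, 20% of it spent on lookups.
# Hypothetical: $30 fully loaded hourly compensation, 20,000 calls per rep per year.
per_call = lookup_cost_per_call(12, 0.20, 30.0)
annual_per_rep = per_call * 20_000

print(f"per call: ${per_call:.2f}, per rep per year: ${annual_per_rep:,.0f}")
```

This treats the lookup time as fully recoverable, which is the optimistic end; if the single frame still takes some seconds to read, scale `lookup_share` down accordingly.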
Then look at the software, hardware, maintenance and ongoing management of the likely customer record sources (pick the worst and best quality one based on your current understanding), which will end up in a centrally governed instance. Per DAMA, every duplicate record will cost you between $0.45 (party) and $0.85 (product) per transaction (edit touch). At the very least each record will be touched once a year (likely 3-5 times), so multiply your duplicated record count by that and you have your savings from just de-duplication. You can also use Aberdeen’s benchmark of 71 serious errors per 1,000 records, meaning the chance of transactional failure and required effort (% of one or more FTE’s daily workday) to fix is high. If this does not work for you, run a data profile with one of the many tools out there.
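The de-duplication math can be sketched in a few lines. The per-touch costs and the 71-per-1,000 error rate are the benchmark figures quoted above; the duplicate and record counts are hypothetical and only there to make the example run.

```python
def dedup_savings(duplicate_records, cost_per_touch, touches_per_year):
    """Annual cost avoided by removing duplicates that would otherwise be touched."""
    return duplicate_records * cost_per_touch * touches_per_year

def expected_error_records(total_records, errors_per_thousand=71):
    """Records likely to carry a serious error, using the quoted per-1,000 benchmark."""
    return total_records * errors_per_thousand / 1000

# Hypothetical: 50,000 duplicated party records out of 1,000,000 in total.
# From the text: $0.45 per touch for party data, touched 1 to 5 times a year.
savings_low = dedup_savings(50_000, 0.45, 1)
savings_high = dedup_savings(50_000, 0.45, 5)
error_records = expected_error_records(1_000_000)

print(f"savings: ${savings_low:,.0f} to ${savings_high:,.0f} per year")
print(f"records with serious errors: {error_records:,.0f}")
```

Running the low and high touch counts side by side gives a range instead of a single number, which tends to be easier to defend in front of a CFO.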
If standardization of records (zip codes, billing codes, currency, etc.) is the problem, ask your business partner how many customer contacts (calls, mailing, emails, orders, invoices or account statements) fail outright and/or require validation because of these attributes. Once again, if you apply the productivity gains mentioned earlier, there are your savings. If you look at the number of orders whose payment or revenue recognition gets delayed by a week or a month, and the average order amount, you can quantify how much profit (multiply by operating margin) you would be able to pull into the current financial year from the next one.
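A minimal sketch of the pull-forward calculation, assuming every input is a placeholder: the order count, average order amount and operating margin below are hypothetical and should come from your own order and finance data.

```python
def profit_pulled_forward(delayed_orders, avg_order_amount, operating_margin):
    """Profit that lands in the current year if delayed orders are recognized on time."""
    return delayed_orders * avg_order_amount * operating_margin

# Hypothetical: 400 orders slip past year end, $2,500 average order, 12% operating margin.
pulled_forward = profit_pulled_forward(400, 2_500.0, 0.12)

print(f"profit pulled into the current year: ${pulled_forward:,.0f}")
```

Only orders that would otherwise slip across the year boundary count here; orders delayed within the same financial year change timing but not the annual figure.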
The same is true for speeding up the introduction of a new product or a change to it, which generates profits earlier. Note that the time value of funds realized earlier is too small to matter in most instances, especially in the current interest rate environment.
If emails bounce back or snail mail gets returned (no such address, no such name at this address, no such domain, no such user at this domain), e(mail) verification tools can help reduce the bounces. If every mail piece (forget email due to the minuscule cost) costs $1.25, and this will vary by type of mailing (catalog, promotion post card, statement letter), incorrect or incomplete records are wasted cost. If you can, use fully loaded print cost incl. 3rd party data prep and returns handling. You will never capture all cost inputs but take a conservative stab.
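Here is that conservative stab as a small sketch. The $1.25 per piece is the figure from the paragraph above; the mailing volume and the undeliverable rate are hypothetical stand-ins.

```python
def wasted_mail_cost(pieces_sent, undeliverable_rate, cost_per_piece):
    """Spend on mail pieces that never reach the intended recipient."""
    return pieces_sent * undeliverable_rate * cost_per_piece

# From the text: $1.25 per mail piece.
# Hypothetical: 200,000 pieces per year, 4% undeliverable due to bad records.
wasted = wasted_mail_cost(200_000, 0.04, 1.25)

print(f"wasted mailing cost: ${wasted:,.0f} per year")
```

If you have a fully loaded cost per piece (print, data prep, returns handling), pass that in as `cost_per_piece` instead of the postage-and-print figure.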
If it was an offer, reduced bounces should also improve your response rate (also true for email now). Prospect mail response rates are typically around 1.2% (Direct Marketing Association), whereas phone response rates are around 8.2%. If you know that your current response rate is half that (for argument sake) and you send out 100,000 emails of which 1.3% (Silverpop) have customer data issues, then fixing 81-93% of them (our experience) will drop the bounce rate to under 0.3% meaning more emails will arrive/be relevant. This in turn multiplied by a standard conversion rate (MarketingSherpa) of 3% (industry and channel specific) and average order (your data) multiplied by operating margin gets you a benefit value for revenue.
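The email example chains several rates together, so a short sketch may help. The 100,000 emails, the 1.3% issue rate, the 81-93% fix rate and the 3% conversion rate are taken from the paragraph above; the average order amount and operating margin are hypothetical.

```python
def email_revenue_benefit(emails_sent, issue_rate, fix_rate,
                          conversion_rate, avg_order_amount, operating_margin):
    """Return (recovered emails, residual issue rate, profit benefit)."""
    recovered = emails_sent * issue_rate * fix_rate
    residual_issue_rate = issue_rate * (1 - fix_rate)
    benefit = recovered * conversion_rate * avg_order_amount * operating_margin
    return recovered, residual_issue_rate, benefit

# Hypothetical: $150 average order, 10% operating margin.
low = email_revenue_benefit(100_000, 0.013, 0.81, 0.03, 150.0, 0.10)
high = email_revenue_benefit(100_000, 0.013, 0.93, 0.03, 150.0, 0.10)

for label, (recovered, residual, benefit) in (("low", low), ("high", high)):
    print(f"{label}: {recovered:,.0f} emails recovered, "
          f"residual issue rate {residual:.2%}, benefit ${benefit:,.2f}")
```

At both ends of the fix-rate range the residual issue rate stays under the 0.3% mentioned above, so the text’s figure is consistent with its own inputs.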
If product data and inventory carrying cost or supplier spend are your issue, find out how many supplier shipments you receive every month, the average cost of a part (or cost range), apply the Aberdeen master data failure rate (71 in 1,000) to use cases around lack of or incorrect supersession or alternate part data, to assess the value of a single shipment’s overspend. You can also just use the ending inventory amount from the 10-K report and apply 3-10% improvement (Aberdeen) in a top-down approach. Alternatively, apply 3.2-4.9% to your annual supplier spend (KPMG).
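The two top-down approaches reduce to applying a benchmark range to a base amount. In this sketch the percentage ranges are the ones quoted above, while the ending inventory and annual supplier spend are hypothetical figures standing in for your own.

```python
def improvement_range(base_amount, low_rate, high_rate):
    """Apply a benchmark improvement range to a base amount."""
    return base_amount * low_rate, base_amount * high_rate

# Hypothetical: $40M ending inventory, $250M annual supplier spend.
# From the text: 3-10% on inventory, 3.2-4.9% on supplier spend.
inventory_low, inventory_high = improvement_range(40_000_000, 0.03, 0.10)
spend_low, spend_high = improvement_range(250_000_000, 0.032, 0.049)

print(f"inventory: ${inventory_low:,.0f} to ${inventory_high:,.0f}")
print(f"supplier spend: ${spend_low:,.0f} to ${spend_high:,.0f}")
```

Treat the two results as alternative estimates of overlapping effects rather than amounts to add together.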
You could also investigate the expediting or return cost of shipments in a period due to incorrectly aggregated customer forecasts, wrong or incomplete product information or wrong shipment instructions in a product or location profile. Apply Aberdeen’s 5% improvement rate and there you go.
Consider that a North American utility told us that just fixing their 200 Tier1 suppliers’ product information achieved an increase in discounts from $14 to $120 million. They also found that fixing one basic out of sixty attributes in one part category saves them over $200,000 annually.
So what ROI percentages would you find tolerable or justifiable for, say, an EDW project, a CRM project, a new claims system, etc.? What annual savings or new revenue would you be comfortable with? What was the craziest improvement you have seen come to fruition that nobody expected?
Next time, I will add some more “use cases” to the list and look at some philosophical implications of averages.
As adjunct university faculty, I get to talk to students about how business strategy increasingly depends upon understanding how to leverage information. To make discussion more concrete, I share with students the work of Alvin Toffler. In The Third Wave, Toffler asserts that we live in a world where competition will increasingly take place upon the currency and usability of information.
In a recent interview, Toffler said that “given the acceleration of change, companies, individuals, and governments base many of their daily decisions on obsoledge—knowledge whose shelf life has expired.” He continues by stating that “companies everywhere are trying to put a price on certain forms of intellectual property. But if…knowledge is at the core of the money economy, then we need to understand knowledge much better than we do now. And tiny insights can yield huge outputs”.
Driving better information management in the information age
To me, this drives to three salient conclusions for information age businesses:
- Information needs to drive further down organizations because top decision makers do not have the background to respond at the pace of change.
- Information needs to be available faster, which means that we need to reduce the processing time for structured and unstructured information sources.
- Information needs to be available when the organization is ready for it. For multinational enterprises this means “Always On” 24/7 across multiple time zones on any device.
Effective managers today are effective managers of people and information
Effective managers today are effective managers of information. Because processing may take too much time, Toffler’s remarks suggest to me we need to consider human information—the ideas and communications we share every day—within the mix of getting access to the right information when it is needed and where it is needed. Now more than ever is the time for enterprises to ensure their decision makers have the timely information to make better business decisions when they are relevant. This means that unstructured data, a non-trivial majority of business information, needs to be made available to business users and related to existing structured sources of data.
Derek Abell says that “for (management) control to be effective, data must be timely and provided at interval that allows effective intervention”. Today this is a problem for most information businesses. As I see it, information optimization is the basis of powering the enterprise through “Third Wave” business competition. Organizations that have the “right to win” will have as a core capability better-than-class access to current information for decision makers.
Putting in place a winning information management strategy
If you talk to CIOs today, they will tell you that they are currently facing four major information age challenges.
- Mobility—Enabling their users to view data anytime, anyplace, and on any device
- Information Trust—Making data dependable enough for business decisions as well as governing data across all business systems.
- Competing on Analytics—Getting information to business users fast enough to avoid Toffler’s Obsoledge.
- New and Big Data Sources—Connecting existing data to new value added sources of data.
Some information age
Lots of things, however, get in the way of delivering on the promises of the Information Age. Our current data architecture is siloed, fragile, and built upon layer after layer of spaghetti code integrations. Think about what is involved just to cobble together data on a company’s supply chain. A morass of structured data systems has vendor and transaction records locked up in application databases and data warehouses all over the extended enterprise. So it is not surprising that enterprises struggle to put together current, relevant data to run their businesses upon. Functions like finance depend largely upon manual extracts being massaged and integrated in spreadsheets because of concern over the quality of data being provided by financial systems. Some information age!
How do we connect to new sources of data?
At the same time, many are trying today to extend the information architecture to add social media data, mobile location data, and even machine data. Much of this data is not put together in the same way as data in an application database or data warehouse. However, being able to relate this data to existing data sources can yield significant benefits. Think about the potential benefit of being able to relate social interactions and mobile location data to sales data or to relate machine data to compliance data.
A big problem is that many of these new data types potentially have even more data quality gaps than historical structured data systems. As a result, the signal-to-noise ratio for this data can be very low. But this data can be invaluable to business decision making, so it needs to be cleaned up and related to older data sources. Finally, it needs to be provided to business users in whatever manner they want to consume it.
How then do we fix the Information Age?
Enabling the kind of Information Age that Toffler imagined requires two things: enterprises must fix their data management, and they must enable the information intelligence needed to drive real competitive advantage. Fixing data management involves delivering good data that business users can safely make decisions from. It also involves ensuring that data, once created, is protected. CFOs that we have talked to say the Target data breach was a watershed event for them, something that they expect will receive more and more auditing attention.
We need at the same time to build the connection between old data sources and new data sources. And connecting data must not take as long as it did in the past. Delivery needs to happen faster so business problems can be recognized and solved more quickly. Users need to get access to data when and where they need it.
With data management fixed, data intelligence needs to give business users the ability to make sense of what they find in the data. Business users also need to be able to search for and find data. And they need self-service so they can combine existing and new unstructured data sources to test hypotheses about how data is interrelated. This means the ability to assemble data from different sources at different times. Simply put, this is about data orchestration without any preconceived process. And lastly, business users need the intelligence to automatically sense and respond to changes as new data is collected.
Tiny insights can yield huge outputs
Obviously, there is a cost to solving our information age issues, but it is important to remember what Toffler says. “Tiny insights can yield huge outputs”. In other words, the payoff is huge for shaking off the shackles of our early information age business architecture. And those that do this will increasingly have the “right to win” against their competitors as they use information to wring every last drop of value from their business processes.
Solution Brief: The Intelligent Data Platform
A few days ago, I came across a post, 5 C’s of MDM (Case, Content, Connecting, Cleansing, and Controlling), by Peter Krensky, Sr. Research Associate, Aberdeen Group and this response by Alan Duncan with his 5 C’s (Communicate, Co-operate, Collaborate, Cajole and Coerce). I like Alan’s list much better. Even though I work for a product company specializing in information management technology, the secret to successful enterprise information management (EIM) is in tackling the business and organizational issues, not the technology challenges. Fundamentally, data management at the enterprise level is an agreement problem, not a technology problem.
So, here I go with my 5 C’s: (more…)
I’ve “sold” data integration as a concept for the last 20 years. Let me tell you, it’s challenging to define the benefits to those who don’t work with this technology every day. That said, most of the complaints I hear about enterprise IT are around the lack of data integration, and thus the inefficiencies that go along with that lack, such as re-keying data, data quality issues, lack of automation across systems, and so forth.
Considering that most of you will sell data integration to your peers and leadership, I’ve come up with 3 proven ways to sell data integration internally.
First, focus on the business problems. Use real world examples from your own business. It’s not tough to find any number of cases where the data was just not there to make core operational decisions that could have avoided some huge mistakes that proved costly to the company. Or, more likely, there are things like ineffective inventory management that has no way to understand when orders need to be placed. Or, there’s the go-to standard: No single definition of what a “customer” or a “sale” is amongst the systems that support the business. That one is like back pain, everyone has it at some point.
Second, define the business case in practical terms with examples. Once you define the business problems that exist due to lack of a sound data integration strategy and technologies, it’s time to put numbers behind those problems. Those in IT have a tendency to either way overstate or way understate the amount of money that’s being wasted and thus could be saved by using data integration approaches and technology. So, provide practical numbers that you can back up with existing data.
Finally, focus on a phased approach to implementing your data integration solution. The “Big Bang Theory” is a great way to define the beginning of the universe, but it’s not the way you want to define the rollout of your data integration technology. Define a workable plan that moves from one small grouping of systems and databases to another, over time, and with a reasonable amount of resources and technology. You do this to remove risk from the effort, as well as to manage costs and ensure that you can dial lessons learned back into the efforts. I would rather roll out data integration within an enterprise using small teams and more problem domains than attempt to do everything within a few years.
The reality is that data integration is no longer optional for enterprises these days. It’s required for so many reasons, from data sharing, information visibility, compliance, security, automation…the list goes on and on. IT needs to take point on this effort. Selling data integration internally is the first and most important step. Go get ‘em.