Tag Archives: Data Integration
This blog post initially appeared on CMSwire.com and is reblogged here with their consent.
Friends of mine were remodeling their master bath. After searching for a claw foot tub in stores and online, they found the perfect one that fit their space. It was only available for purchase on the retailer’s e-commerce site, so they bought it online.
When it arrived, the tub was too big. The dimensions online were incorrect. They went to return it to the closest store, but were told they couldn’t — because it was purchased online, they had to ship it back.
The retailer didn’t have a total customer relationship view or a single view of product information or inventory across channels and touch points. This left the customer representative working with a system that was a silo of limited information. She didn’t have access to a rich customer profile. She didn’t know that Joe and his wife spent almost $10,000 with the brand in the last year. She couldn’t see the products they bought online and in stores. Without this information, she couldn’t deliver a great customer experience.
It was a terrible customer experience. My friends share it with everyone who asks about their remodel. They name the retailer when they tell the story. And, they don’t shop there anymore. This terrible customer experience is negatively impacting the retailer’s revenue and brand reputation.
Bad customer experiences happen a lot. Companies in the US lose an estimated $83 billion each year due to defections and abandoned purchases as a direct result of a poor experience, according to a Datamonitor/Ovum report.
Customer Experience is the New Marketing
Gartner believes that by 2016, companies will compete primarily on the customer experiences they deliver. So who should own customer experience?
Twenty-five percent of CMOs say that their CEOs expect them to lead customer experience. What’s their definition of customer experience? “The practice of centralizing customer data in an effort to provide customers with the best possible interactions with every part of the company, from marketing to sales and even finance.”
Mercedes Benz USA President and CEO Steve Cannon said, “Customer experience is the new marketing.”
The Gap Between Customer Expectations + Your Ability to Deliver
My previous post, 3 Barriers to Delivering Omnichannel Experiences, explained how omnichannel is all about seeing your business through the eyes of your customer. Customers don’t think in terms of channels and touch points; they just expect a seamless, integrated and consistent customer experience. It’s one brand to the customer. But there’s a gap between customer expectations and what most businesses can deliver today.
Most companies who sell through multiple channels operate in silos. They are channel-centric rather than customer-centric. This business model doesn’t empower employees to deliver seamless, integrated and consistent customer experiences across channels and touch points. Different leaders manage each channel and are held accountable to their own P&L. In most cases, there’s no incentive for leaders to collaborate.
Old Navy’s CMO, Ivan Wicksteed, got it right when he said,
“Seventy percent of searches for Old Navy are on a mobile device. Consumers look at the product online and often want to touch it in the store. The end goal is not to get them to buy in the store. The end goal is to get them to buy.”
The end goal is what incentives should be based on.
Executives at most organizations I’ve spoken with admit they are at the very beginning stages of their journey to becoming omnichannel retailers. They recognize that empowering employees with a total customer relationship view and a single view of product information and inventory across channels are critical success factors.
Becoming an omnichannel business is not an easy transition. It forces executives to rethink their definition of customer-centricity and whether their business model supports it. “Now that we need to deliver seamless, integrated and consistent customer experiences across channels and touch points, we realized we’re not as customer-centric as we thought we were,” admitted an SVP of marketing at a financial services company.
You Have to Transform Your Business
“We’re going through a transformation to empower our employees to deliver great customer experiences at every stage of the customer journey,” said Chris Brogan, SVP of Strategy and Analytics at Hyatt Hotels & Resorts. “Our competitive differentiation comes from knowing our customers better than our competitors. We manage our customer data like a strategic asset so we can use that information to serve customers better and build loyalty for our brand.”
Hyatt uses data integration, data quality and master data management (MDM) technology to connect the numerous applications that contain fragmented customer data including sales, marketing, e-commerce, customer service and finance. It brings the core customer profiles together into a single, trusted location, where they are continually managed. Now its customer profiles are clean, de-duplicated, enriched and validated. Members of a household as well as the connections between corporate hierarchies are now visible. Business and analytics applications are fueled with this clean, consistent and connected information so customer-facing teams can do their jobs more effectively.
When he first joined Hyatt, Brogan did a search for his name in the central customer database and found 13 different versions of himself. This included the single Chris Brogan who lived across the street from Wrigley Field with his buddies in his 20s and the Chris Brogan who lives in the suburbs with his wife and two children. “I can guarantee those two guys want something very different from a hotel stay,” he joked. Those guest profiles have now been successfully consolidated.
According to Brogan,
“Successful marketing, sales and customer experience initiatives need to be built on a solid customer data foundation. It’s much harder to execute effectively and continually improve if your customer data is a mess.”
Improving How You Manage, Use and Analyze Data is More Important Than Ever
Some companies lack a single view of product information across channels and touch points. About 60 percent of retail managers believe that shoppers are better connected to product information than in-store associates. That’s a problem. The same challenges exist for product information as customer information. How many different systems contain valuable product information?
Harrods overcame this challenge. The retailer has a strategic initiative to transform from a single iconic store to an omnichannel business. In the past, Harrods’ merchants managed information for about 500,000 products for the store point of sale system and a few catalogs. Now they are using product information management technology (PIM) to effectively manage and merchandise 1.7 million products in the store and online.
Because they are managing product information centrally, they can fuel the ERP system and e-commerce platform with full, searchable multimedia product information. Harrods has also reduced the time it takes to introduce new products and generate revenue from them. In less than one hour, buyers complete the process from sourcing to market readiness.
It Ends with Satisfied Customers
By 2016, you will need to be ready to compete primarily on the customer experiences you deliver across channels and touch points. This means really knowing who your customers are so you can serve them better. Many businesses will transform from a channel-centric business model to a truly customer-centric business model. They will no longer tolerate messy data. They will recognize the importance of arming marketing, sales, e-commerce and customer service teams with the clean, consistent and connected customer, product and inventory information they need to deliver seamless, integrated and consistent experiences across touch points. And all of us will be more satisfied customers.
The verdict is in. Data is now broadly perceived as a source of competitive advantage. We all feel the heat to deliver good data. It is no wonder organizations view Analytics initiatives as highly strategic. But the big question is, can you really trust your data? Or are you just creating pretty visualizations on top of bad data?
We also know there is a shift towards self-service Analytics. But did you know that according to Gartner, “through 2016, less than 10% of self-service BI initiatives will be governed sufficiently to prevent inconsistencies that adversely affect the business”? [1] This means that you may actually show up at your next big meeting and have data that contradicts your colleague’s data. Perhaps you are not working off of the same version of the truth. Maybe you have siloed data on different systems and they are not working in concert? Or is your definition of ‘revenue’ or ‘leads’ different from that of your colleague’s?
So are we taking our data for granted? Are we just assuming that it’s all available, clean, complete, integrated and consistent? As we work with organizations to support their Analytics journey, we often find that the harsh realities of data are quite different from perceptions. Let’s further investigate this perception gap.
For one, people may assume they can easily access all data. In reality, if data connectivity is not managed effectively, we often need to beg, borrow and steal to get the right data from the right person. And that is if we are lucky. In less fortunate scenarios, we may need to settle for partial data or a cheap substitute for the data we really wanted. And you know what they say, the only thing worse than no data is bad data. Right?
Another common misperception is: “Our data is clean. We have no data quality issues”. Wrong again. When we work with organizations to profile their data, they are often quite surprised to learn that their data is full of errors and gaps. One company recently discovered within one minute of starting their data profiling exercise, that millions of their customer records contained the company’s own address instead of the customers’ addresses… Oops.
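A quick profiling pass of the kind described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `customers` records, the `address` field name and the `most_common_values` helper are hypothetical and do not come from any particular profiling tool. The idea is simply that a value repeated across a suspiciously large share of records (such as the company's own address) shows up immediately.

```python
from collections import Counter


def most_common_values(records, field, top=3):
    """Return (value, count, share of all records) for the most frequent
    non-empty values of `field`, after trimming and lowercasing."""
    values = [(r.get(field) or "").strip().lower() for r in records]
    counts = Counter(v for v in values if v)
    total = len(records)
    return [(value, count, count / total) for value, count in counts.most_common(top)]


# Hypothetical sample data, invented for this illustration.
customers = [
    {"name": "A. Jones", "address": "1 Corporate Way, Springfield"},
    {"name": "B. Smith", "address": "1 Corporate Way, Springfield"},
    {"name": "C. Lee", "address": "42 Elm St, Shelbyville"},
    {"name": "D. Patel", "address": "1 Corporate Way, Springfield "},
    {"name": "E. Kim", "address": ""},
]

profile = most_common_values(customers, "address")
for value, count, share in profile:
    print(f"{value!r}: {count} records ({share:.0%})")
```

In this made-up sample, one address accounts for three of the five records, which is exactly the kind of pattern worth investigating before trusting the data.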
Another myth is that all data is integrated. In reality, your data may reside in multiple locations: in the cloud, on premise, in Hadoop and on mainframe and anything in between. Integrating data from all these disparate and heterogeneous data sources is not a trivial task, unless you have the right tools.
And here is one more consideration to mull over. Do you find yourself manually hunting down and combining data to reproduce the same ad hoc report over and over again? Perhaps you often find yourself doing this in the wee hours of the night? Why reinvent the wheel? It would be more productive to automate the process of data ingestion and integration for reusable and shareable reports and Analytics.
Simply put, you need great data for great Analytics. We are excited to host Philip Russom of TDWI in a webinar to discuss how data management best practices can enable successful Analytics initiatives.
And how about you? Can you trust your data? Please join us for this webinar to learn more about building a trust-relationship with your data!
[1] Gartner Report, ‘Predicts 2015: Power Shift in Business Intelligence and Analytics Will Fuel Disruption’; Authors: Josh Parenteau, Neil Chandler, Rita L. Sallam, Douglas Laney, Alan D. Duncan; Nov 21 2014
Back in 2004, we saw the rapid growth of SaaS providers such as Salesforce.com. However, there was typically no consistent data integration strategy to go along with the use of SaaS. In many instances, SaaS-delivered applications became the new data silos in the enterprise, silos that lacked a sound integration plan and integration technology.
Ten years later, we’ve gotten to a point where we have the ability to solve problems using SaaS, as well as the data integration problems around the use of SaaS. However, we typically lack the knowledge and understanding of how to effectively use data integration technology within an enterprise to integrate SaaS problem domains.
Lawson looks at both sides of the SaaS integration argument. “Surveys certainly show that integration is less of a concern for SaaS than in the early days, when nearly 88 percent of SaaS companies said integration concerns would slow down adoption and more than 88 percent said it’s an important or extremely important factor in winning new customers.”
Again, while we’ve certainly gotten better at integration, we’re nowhere near being out of the woods. “A Dimensional Research survey of 350 IT executives showed that 67 percent cited data integration problems as a challenge with SaaS business applications. And as with traditional systems, integration can add hidden costs to your project if you ignore it.”
As I’ve stated many times in this blog, integration requires a bit of planning and the use of solid technology. While this does require some extra effort and money, the return on the value of this work is huge.
SaaS integration requires that you take a bit of a different approach than traditional enterprise integration. SaaS systems typically place your data behind well-defined APIs that can be accessed directly or through a data integration technology. While the information can be consumed by anything that can invoke an API, enterprises still have to deal with structure and content differences, and that’s typically best handled using the right data integration technology.
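To make the "structure and content differences" concrete, here is a minimal Python sketch, assuming two hypothetical SaaS applications that return the same customer under different field names and formatting. The field maps, payloads and the `to_canonical` helper are invented for illustration; a real data integration tool would manage such mappings as shared, reusable rules.

```python
def to_canonical(record, field_map, source):
    """Rename source-specific fields to shared names (structure) and
    normalize the email value (content)."""
    canonical = {target: record.get(src) for src, target in field_map.items()}
    canonical["email"] = (canonical.get("email") or "").strip().lower()
    canonical["source"] = source
    return canonical


# Hypothetical field mappings for two SaaS APIs (assumptions, not real schemas).
CRM_MAP = {"AccountName": "company", "ContactEmail": "email"}
BILLING_MAP = {"customer_name": "company", "email_address": "email"}

crm_row = {"AccountName": "Acme Lending", "ContactEmail": " Pat@Acme.example "}
billing_row = {"customer_name": "Acme Lending", "email_address": "pat@acme.example"}

a = to_canonical(crm_row, CRM_MAP, "crm")
b = to_canonical(billing_row, BILLING_MAP, "billing")

# Once both records share one shape, they can be compared or joined.
print(a["email"] == b["email"])
```

Invoking the API is the easy part; the mapping step is where the two systems are actually reconciled.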
Another thing to consider, and one that is again often overlooked, is the need for both data governance and data security around your SaaS integration solution. There should be a centralized control mechanism to support the proper management and security of the data, as well as a mechanism to deal with data quality issues that often emerge when consuming data from any cloud computing services.
The reality is that SaaS is here to stay. Even enterprise software players that put off the move to SaaS-delivered systems are now standing up SaaS offerings. The economics around the use of SaaS are just way too compelling. However, as SaaS-delivered systems become more commonplace, so will the emergence of new silos. This will not be an issue if you leverage the right SaaS integration approach and technology. What will your approach be?
A friend of mine recently reached out to me for some advice on CRM solutions in the market. Though I have not worked for a CRM vendor, my experience ranges from working directly for companies that implemented such solutions to my current role interacting with large and small organizations regarding their data requirements to support ongoing application investments across industries. As we spoke, memories started to surface of when he and I had worked on implementing Salesforce.com (SFDC) many years ago. They were memories that we wanted to forget, but they were important to call out given his new situation.
We worked together for a large mortgage lending software vendor selling loan origination solutions to brokers and small lenders, mainly through email and snail mail based marketing. He was responsible for Marketing Operations, and I ran Product Marketing. The company looked at Salesforce.com to help streamline our sales operations and improve how we marketed and serviced our customers. The existing CRM system was from the early ’90s and though it did what the company needed it to do, it was heavily customized, costly to operate, and had served its useful life. It was time to upgrade, to help grow the business, improve business productivity, and enhance customer relationships.
After 90 days of rolling out SFDC, we ran into some old familiar problems across the business. Sales reps continued to struggle in knowing who was a current customer using our software, marketing managers could not create quality mailing lists for prospecting purposes, and call center reps were not able to tell if the person on the other end was a customer or prospect. Everyone wondered why this was happening given we adopted the best CRM solution in the market. You can imagine the heartburn and ulcers we all had after making such a huge investment in our new CRM solution. C-Level executives were questioning our decisions and blaming the applications. The truth was, the issues were not related to SFDC. These significant headaches were caused by the data that we had migrated into the system, the lack of proper governance, and the absence of a capable information architecture to support the required data management integration between systems.
During the implementation phase, IT imported our entire customer database of 200K+ unique customer entities from the old system to SFDC. Unfortunately, the mortgage industry was very transient, and on average there were roughly 55K licensed mortgage brokers and lenders in the market. Because no one ever validated who was really a current customer vs. someone who had ever bought our product, we had serious data quality issues including:
- Trial users who purchased evaluation copies of our products that had since expired were tagged as current customers
- Duplicate records caused by manual data entry errors, consisting of companies with similar names entered slightly differently but with the same business address, were tagged as unique customers
- Subsidiaries of parent companies in different parts of the country were each tagged as a unique customer
- Lastly, we imported the marketing contact database of prospects, which were incorrectly accounted for as customers in the new system
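The duplicate-record problem in the list above, where company names are keyed in slightly differently at the same business address, is the kind of thing even a simple check can surface. The Python sketch below is an illustration under assumptions: the `accounts` records are invented, and the 0.8 similarity threshold is an arbitrary starting point rather than a recommended value. It uses `difflib.SequenceMatcher` from the standard library, not the matching logic of any CRM or data quality product.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase, drop commas and periods, and collapse whitespace."""
    return " ".join(text.lower().replace(",", " ").replace(".", " ").split())


def likely_duplicates(records, threshold=0.8):
    """Return (id, id, score) for pairs that share a normalized address
    and have similar normalized names."""
    pairs = []
    for a, b in combinations(records, 2):
        if normalize(a["address"]) != normalize(b["address"]):
            continue
        score = SequenceMatcher(None, normalize(a["name"]), normalize(b["name"])).ratio()
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


# Hypothetical account records, invented for this illustration.
accounts = [
    {"id": 1, "name": "First Home Mortgage, Inc.", "address": "100 Main St., Austin TX"},
    {"id": 2, "name": "First Home Mortage Inc", "address": "100 Main St, Austin TX"},
    {"id": 3, "name": "Lakeside Lending", "address": "7 Shore Rd, Madison WI"},
]

pairs = likely_duplicates(accounts)
for first_id, second_id, score in pairs:
    print(f"Records {first_id} and {second_id} look like duplicates (similarity {score})")
```

Comparing every pair of records is fine for a small sample but does not scale to 200K+ entities; grouping by normalized address first would be the natural next step.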
We also failed to integrate real-time purchasing data and information from our procurement systems for sales and support to handle customer requests. Instead of integrating that data in real-time with proper technology, IT had manually loaded these records at the end of the week via FTP resulting in incorrect billing information, statement processing, and a ton of complaints from customers through our call center. The price we paid for not paying attention to our data quality and integration requirements before we rolled out Salesforce.com was significant for a company of our size. For example:
- Marketing got hit pretty hard. Each quarter we mailed evaluation copies of new products to our customer database of 200K, each costing the company $12 to produce and mail. Total cost = $2.4M annually. Because we had such bad data, we would get 60% of our mailings returned because of invalid addresses or wrong contact information. The cost of bad data to marketing = $1.44M annually.
- Next, Sales struggled miserably when trying to upgrade customers by running cold call campaigns using the names in the database. As a result, sales productivity dropped by 40% and we experienced over 35% sales turnover that year. Within a year of using SFDC, our head of sales got let go. Not good!
- Customer support used SFDC to service customers, and our average call times were 40 min per service ticket. We had believed that was “business as usual” until we surveyed reps on what they were spending their time on each day, and over 50% said it was dealing with billing issues caused by bad contact information in the CRM system.
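The marketing figures above can be reproduced with simple arithmetic. The short Python sketch below uses only the numbers stated in the first bullet (200K records, $12 per piece, 60% returned); the variable names are made up for illustration, and it deliberately computes the cost of mailing the full list once, without assuming how many mailings were sent per year.

```python
LIST_SIZE = 200_000       # customer records mailed
COST_PER_PIECE = 12       # dollars to produce and mail one evaluation copy
RETURN_RATE_PCT = 60      # percent of mailings returned as undeliverable

# Cost of mailing the full list once, and the part spent on returned mail.
mailing_cost = LIST_SIZE * COST_PER_PIECE
wasted_cost = mailing_cost * RETURN_RATE_PCT // 100

print(f"Cost to mail the full list: ${mailing_cost:,}")
print(f"Spent on returned mail:     ${wasted_cost:,}")
```

This yields the $2.4M and $1.44M figures quoted above.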
At the end of our conversation, this was my advice to my friend:
- Conduct a data quality audit of the systems that would interact with the CRM system. Audit how complete your critical master and reference data is including names, addresses, customer ID, etc.
- Do this before you invest in a new CRM system. You may find that many of the challenges faced with your existing applications are caused by data gaps rather than by the legacy application.
- If you have a data governance program, involve that team in the CRM initiative to ensure they understand what your requirements are and see how they can help.
- However, if you do decide to modernize, collaborate with your IT teams, especially your Application Development teams and your Enterprise Architects, to ensure all of the best options are considered to handle your data sharing and migration needs.
- Lastly, consult with your technology partners, including your new CRM vendor. They may be working with solution providers to help address these data issues, as you are probably not the only one in this situation.
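As a rough illustration of the first piece of advice, the completeness part of a data quality audit can start as small as the Python sketch below. The `sample` records, the field names and the `completeness` helper are hypothetical; a real audit would run against the actual systems that feed the CRM and would also check validity and accuracy, not just whether a value is present.

```python
def completeness(records, fields):
    """Return, for each field, the fraction of records with a non-empty value."""
    total = len(records)
    report = {}
    for field in fields:
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        report[field] = filled / total if total else 0.0
    return report


# Hypothetical master data sample, invented for this illustration.
sample = [
    {"customer_id": "C001", "name": "Acme Lending", "address": "100 Main St"},
    {"customer_id": "C002", "name": "Lakeside Lending", "address": ""},
    {"customer_id": "", "name": "Summit Brokers", "address": None},
    {"customer_id": "C004", "name": "", "address": "9 Hill Rd"},
]

report = completeness(sample, ["customer_id", "name", "address"])
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```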
CRM systems have come a long way in today’s Big Data and Cloud Era. Many firms are adopting more flexible solutions offered through the Cloud like Salesforce.com, Microsoft Dynamics, and others. Regardless of how old or new, on premise or in the cloud, companies invest in CRM not just to serve their sales teams or increase marketing conversion rates, but to improve their business relationships with their customers. Period! It’s about ensuring you have data in these systems that is trustworthy, complete, up to date, and actionable to improve customer service and help drive sales of new products and services to increase wallet share. So how do you maximize your business potential from these critical business applications?
Whether you are adopting your first CRM solution or upgrading an existing one, keep in mind that Customer Relationship Management is a business strategy, not just a software purchase. It’s also about having a sound and capable data management and governance strategy supported by people, processes, and technology to ensure you can:
- Access and migrate data from old to new, avoiding development cost overruns and project delays.
- Identify, detect, and distribute transactional and reference data from existing systems into your front line business application in real time!
- Manage data quality errors, including duplicate records and invalid names and contact information, through proper data governance and proactive data quality monitoring and measurement during and after deployment.
- Govern and share authoritative master records of customer, contact, product, and other master data between systems in a trusted manner.
Will your data be ready for your new CRM investments? To learn more:
- Download Salesforce Integration for Dummies
- Download a new Whitepaper on how to Maximize Integration ROI with a Hybrid Approach
- Consolidating Multiple Salesforce Orgs: A Best Practice Guide
- Sign up for a 30 Day Trial of Informatica Cloud Integration
Follow me on Twitter @DataisGR8
The first architect grew through the ranks starting as a Database Administrator, a black belt in SQL and COBOL programming. Hand coding was in their DNA for many years and was thought of as the best approach given how customized their business and systems were vs. other organizations. As such, Architect #1 and their team went down the path of building their data management capabilities through custom hand coded scripts, manual data extractions and transformations, and dealing with data quality issues through the business organizations after the data was delivered. Though their approach and decisions delivered on their short term needs, the firm came to realize how much overhead was required to make changes and respond to new requests driven by new industry regulations and changing market conditions.
The second architect is a “gadget guy” at heart who grew up using off the shelf tools vs. hand coding for managing data. He and his team decided not to hand code their data management processes and instead built their solution leveraging best of breed tools for data integration, data quality, and metadata management, some of which were open source and others of which the company already had from previous projects. Though their tools helped automate much of the “heavy lifting,” he and his IT team were still responsible for integrating these point solutions to work together, which required ongoing support and change management.
The last architect is as technically competent as his peers but understood the value of building something once to use across the business. His approach was a little different than the first two. Understanding the risks and costs of hand coding or using one off tools to do the work, he decided to adopt an integrated platform designed to handle the complexities, sources, and volumes of data required by the business. The platform also incorporated shared metadata, reusable data transformation rules and mappings, a single source of required master and reference data, and provided agile development capabilities to reduce the cost of implementation and ongoing change management. Though this approach was more expensive to implement, the long term cost and performance benefits made the decision a “no brainer.”
Lurking in the woods is Mr. Wolf. Mr. Wolf is not your typical antagonist, however. He is a regulatory auditor whose responsibility is to ensure these banks can explain how the risk they report to the regulatory authorities is calculated. His job isn’t to shut these banks down but to make sure the financial industry is able to measure risk across the enterprise, explain how risk is measured, and ensure these firms are adequately capitalized as mandated by new and existing industry regulations.
Mr. Wolf visits the first bank for an annual stress test audit. Looking at the result of their stress test, he asks the compliance teams to explain how their data was produced, transformed, and calculated to support the risk measurements they reported as part of the audit. Unfortunately, because of the first architect’s recommendation to hand code their data management processes, IT failed to provide explanations and documentation of what they did, and they found that the developers who created their systems were no longer with the firm. As a result, the bank failed miserably, resulting in stiff penalties and higher audit costs.
Architect #2’s bank was next. Having heard in the news what happened to their peer, the architect and IT teams were confident that they were in good shape to pass their stress test audit. After digging into the risk reports, Mr. Wolf questioned the validity of the data used to calculate Value at Risk (VaR). Unfortunately, the tools that were adopted were never designed nor guaranteed by the vendors to work with each other, resulting in invalid data mapping and data quality rules and gaps within their technical metadata documentation. As a result, bank #2 also failed their audit and found themselves with a ton of one-off tools that helped automate their data management processes but lacked the integration and sharing of rules and metadata to satisfy the regulator’s demand for risk transparency.
Finally, Mr. Wolf investigated Architect #3’s firm. Having seen the result of the first two banks, Mr. Wolf was leery of their ability to pass their stress test audit. Mr. Wolf presented similar demands. This time, however, Bank #3 provided detailed and comprehensive metadata documentation of their risk data measurements, descriptions of the data used in each report, a comprehensive report of each data quality rule used to cleanse their data, and detailed information on each counterparty and legal entity used to calculate VaR. Unable to find gaps in their audit, Mr. Wolf, expecting to “blow” the house down, delivered a passing grade for Bank #3 and their management team due to the right investments they made to support their enterprise risk data management needs.
The moral of this story, similar to the familiar one involving the three little pigs, is about the importance of having a solid foundation to weather market and regulatory storms or the violent bellow of a big bad wolf. That foundation not only covers the required data integration, data quality, master data management, and metadata management needs but also supports collaboration and visibility into how data is produced, used, and performing across the business. Ensuring current and future compliance in today’s financial services industry requires firms to have a solid data management platform, one that is intelligent and comprehensive and that allows Information Architects to help mitigate the risks and costs of hand coding or using point tools to get by only in the short term.
Are you prepared to meet Mr. Wolf?
Have you noticed something different this winter season that most people are cheery about? I’ll give you a hint. It’s not the great sales going on at your local shopping mall but something that helps you get to the mall a lot more affordably than last year. It’s the extremely low gas prices across the globe, fueled by an oversupply of oil relative to demand, driven by geopolitics and a boom in shale oil production in North America and abroad. Like any other commodity, it’s impossible to predict where oil prices are headed. However, one thing is sure: Oil and Gas companies will need timely and quality data as firms invest in new technologies to become more agile, innovative, efficient, and competitive, as reported by a recent IDC Energy Insights Predictions report for 2015.
The report predicts:
- 80% of the top O&G companies will reengineer processes and systems to optimize logistics, hedge risk and efficiently and safely deliver crude, LNG, and refined products by the end of 2017.
- Over the next 3 years, 40% of O&G majors and all software divisions of oilfield services (OFS) will co-innovate on domain specific technical projects with IT professional service firms.
- The CEO will expect immediate and accurate information about top Shale Plays to be available by the end of 2015 to improve asset value by 30%.
- By 2016, 70% of O&G companies will have invested in programs to evolve the IT environment to a third platform driven architecture to support agility and readily adapt to change.
- With continued labor shortages and over 1/3 of the O&G workforce under 45 in three years, O&G companies will turn to IT to meet productivity goals.
- By the end of 2017, 100% of the top 25 O&G companies will apply modeling and simulation tools and services to optimize oil field development programs and 25% will require these tools.
- Spending on connectivity related technologies will increase by 30% between 2014 and 2016, as O&G companies demand vendors provide the right balance of connectivity for a more complex set of data sources.
- In 2015, mergers, acquisitions and divestitures, plus new integrated capabilities, will drive 40% of O&G companies to re-evaluate their current deployments of ERP and hydrocarbon accounting.
- With a business case built on predictive analytics and optimization in drilling, production and asset integrity, 50% of O&G companies will have advanced analytics capabilities in place by 2016.
- With pressures on capital efficiency, by 2015, 25% of the Top 25 O&G companies will apply integrated planning and information to large capital projects, speeding up delivery and reducing over-budget risks by 30%.
Realizing value from these investments will also require Oil and Gas firms to modernize and improve their data management infrastructure and technologies to deliver great data, whether to fuel actionable insights from Big Data technology or to facilitate post-merger application consolidation and integration activities. Great data is only achievable through Great Design, supported by capable solutions designed to help access and deliver timely, trusted, and secure data to those who need it most.
A lack of proper data management investments and competencies has long plagued the oil and gas sector with “less-than acceptable” data and higher operating costs. According to the “Upstream Data and Information Management Survey” conducted by Wipro Technologies, 56% of those surveyed felt that business users spent a quarter or more of their time on low value activities caused by existing data issues (e.g. accessing, cleansing, preparing data) rather than on “high value” activities (e.g. analysis, planning, decision making). The same survey showed the biggest data management issues were timely access to required data and data quality issues from source systems.
So what can Oil and Gas CIO’s and Enterprise Architects do to prepare for the future? Here are some tips for consideration:
- Look to migrate and automate legacy hand-coded data transformation processes by adopting tools that streamline the development, testing, deployment, and maintenance of these complex tasks, so that developers can build, maintain, and monitor data transformation rules once and deploy them across the enterprise.
- Simplify how data is distributed across systems with more modern architectures and solutions, and avoid the cost and complexities of point-to-point integrations.
- Deal with and manage data quality upstream at the source and throughout the data life cycle vs. having end users fix unforeseen data quality errors manually.
- Create a centralized source of shared business reference and master data that can manage a consistent record across heterogeneous systems such as well asset/material information (wellhead, field, pump, valve, etc.), employee data (drill/reservoir engineer, technician), location data (often geo-spatial), and accounting data (for financial roll-ups of cost, production data).
- Establish standards and repeatable best practices by adopting an Integration Competency Center framework to support the integration and sharing of data between operational and analytical systems.
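To make the first and third tips a little more concrete, here is a minimal Python sketch of the "define a rule once, apply it everywhere" idea, paired with a simple quality check run at the source. It is an illustration only, not how any particular product works: the `Rule` class, the helper functions, the field names (`well_id`, `field`), and the sample records are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named, reusable transformation applied to a single field."""
    name: str
    field: str
    transform: Callable[[object], object]


def apply_rules(records: list[Record], rules: list[Rule]) -> list[Record]:
    """Return cleaned copies of the records with every rule applied."""
    cleaned_records = []
    for record in records:
        cleaned = dict(record)  # never mutate the source rows
        for rule in rules:
            if rule.field in cleaned:
                cleaned[rule.field] = rule.transform(cleaned[rule.field])
        cleaned_records.append(cleaned)
    return cleaned_records


def find_quality_issues(records: list[Record], required: list[str]) -> list[tuple[int, str]]:
    """List (row index, field) pairs where a required field is missing or blank."""
    issues = []
    for index, record in enumerate(records):
        for name in required:
            value = record.get(name)
            if value is None or str(value).strip() == "":
                issues.append((index, name))
    return issues


# Hypothetical rules, defined once and shared by every feed below.
RULES = [
    Rule("normalize_well_id", "well_id", lambda v: str(v).strip().upper()),
    Rule("normalize_field_name", "field", lambda v: str(v).strip().title()),
]

# Two hypothetical source systems that describe the same kind of asset.
drilling_system = [{"well_id": " wh-001 ", "field": "north ridge"}]
production_system = [{"well_id": "wh-002", "field": ""}]

clean_drilling = apply_rules(drilling_system, RULES)
clean_production = apply_rules(production_system, RULES)
production_issues = find_quality_issues(production_system, ["well_id", "field"])
```

The same `RULES` list serves both feeds, so a change to one rule is made in one place, and the blank `field` value in the production feed is flagged before it reaches an end user.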
In summary, low oil prices have a direct and positive impact on consumers, especially during the winter season and holidays, and I personally hope they continue for the foreseeable future given that prices were double just a year ago. Unfortunately, no one can predict future energy prices. However, one thing is for sure: the demand for great data by Oil and Gas companies will continue to grow. As such, CIOs and Enterprise Architects will need to recognize the importance of improving their data management capabilities and technologies to ensure success in 2015. How ready are you?
Click to learn more about Informatica in today’s Energy Sector:
The article cites research from Ovum that predicts many enterprises will begin moving toward data integration, driven largely by the rise of cloud computing and big data. However, enterprises need to invest both in modernizing the existing data management infrastructure and in data integration technology. “All of these new investments will push the middleware software market up 9 percent to a $16.3 billion industry, Information Management reports.” This projection is for 2015.
I suspect that’s a bit conservative. In my travels, I see much more interest in data integration strategies, approaches, and technology as cloud computing continues to grow and as enterprises better understand the strategic use of data. So, I would put the growth at 15 percent for 2015.
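As a back-of-the-envelope check on what that difference means in dollars, the arithmetic can be sketched as follows. This assumes the quoted 9 percent and $16.3 billion describe the same one-year step, which the quote implies but does not spell out; the variable names are mine.

```python
# Figures from the quoted projection (USD billions).
projected_market = 16.3
quoted_growth = 0.09

# My own, more optimistic estimate from the paragraph above.
alternative_growth = 0.15

# Work backward to the market size the 9 percent growth starts from.
implied_base = projected_market / (1 + quoted_growth)

# Apply the 15 percent estimate to that same base.
alternative_market = implied_base * (1 + alternative_growth)

print(f"Implied starting market: ${implied_base:.2f}B")
print(f"Market at 15% growth:    ${alternative_market:.1f}B")
```

Under that assumption, the starting point is a little under $15 billion, and 15 percent growth would land the market at roughly $17.2 billion instead of $16.3 billion.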
There are many factors driving this growth, beyond mere interest in cloud computing and big data.
The first consideration is that data is more strategic than initially understood. While businesses have always considered data a huge asset, it is only in the last few years that they have seen the true value of understanding what’s going on inside and outside of their business.
Manufacturing companies want to see the current state of production, as well as production history. Management can now use that data to predict trends to address, such as future issues around employee productivity, or even a piece of equipment that is likely to fail and the impact of that failure on revenue. Healthcare companies are learning how to better monitor patient health, such as spotting likely health problems before they are diagnosed, or leveraging large data to understand when patterns emerge around health issues, such as areas of the country that are more prone to asthma, based upon air quality.
Second, there is the need to deal with compliance issues. The new health care regulations, or even the new regulations around managing a publicly traded company, require a great deal of data management, including data integration.
As these laws emerge, and are altered over time, the reporting requirements are always more complex and far reaching than they were before. Those who want to avoid fines, or even avoid stock drops around mistakes, are paying close attention to this area.
Finally, there is an expectation from customers and employees that you will have a good handle on your data. Ten years ago you could tell a customer on the phone that you needed to check different systems to answer their question. Those days are over. Today’s customers and employees want immediate access to the data they need, and there is no good excuse for not being able to produce that data. If you can’t, your competition will.
The interest in data integration will experience solid growth in 2015, around cloud and big data, for sure. However, other factors will drive this growth, and enterprises will finally understand that data integration is core to an IT strategy, and should never be an afterthought.
It takes a village to build mainstream big data solutions. We often get so caught up in Hadoop use cases and customer successes that sometimes we don’t talk enough about the innovative partner technologies and integrations that enable our customers to put the enterprise data hub at the core of their data architecture and innovate with confidence. Cloudera and Informatica have been working together to integrate our products to enable new levels of productivity and lower deployment and production risk.
Going from Hadoop to an enterprise data hub means a number of things. It means that you recognize the business value of capturing and leveraging all your data for exploration and analytics. It means you’re ready to make the move from Hadoop pilot project to production. And it means your data is important enough that it’s worth securing and making data pipelines visible. It’s the visibility layer, and in particular, the unique integration between Cloudera Navigator and Informatica that I want to focus on in this post.
The era of big data has ushered in increased regulations in a number of industries (banking, retail, healthcare, energy), most of which deal with how data is managed throughout its lifecycle. Cloudera Navigator is the only native end-to-end solution for governance in Hadoop. It provides visibility for analysts to explore data in Hadoop, and enables administrators and managers to maintain a full audit history for HDFS, HBase, Hive, Impala, Spark and Sentry, then run reports on data access for auditing and compliance. The integration of Informatica Metadata Manager in the Big Data Edition and Cloudera Navigator extends this level of visibility and governance beyond the enterprise data hub.
Today, only Informatica and Cloudera provide end-to-end data lineage from source systems through Hadoop, and into BI/analytic and data warehouse systems. And you can view it from a single pane within Informatica.
This is important because Hadoop, and the enterprise data hub in particular, doesn’t function in a silo. It’s an integrated part of a larger enterprise-wide data management architecture. The better the insight into where data originated, where it traveled, who had access to it and what they did with it, the greater our ability to report and audit. No other combination of technologies provides this level of audit granularity.
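To picture what end-to-end lineage means in practice, it helps to think of it as a directed graph of datasets that can be walked forward ("what is derived from this source?") and backward ("where did this report come from?"). The Python sketch below is a conceptual illustration only. It does not use or represent the Cloudera Navigator or Informatica Metadata Manager APIs, and every dataset name in it is made up.

```python
from collections import deque

# Hypothetical lineage edges: each dataset maps to the datasets derived from it.
LINEAGE = {
    "crm_db.customers": ["hdfs:/raw/customers"],
    "hdfs:/raw/customers": ["hive.customers_clean"],
    "hive.customers_clean": ["warehouse.dim_customer", "bi.churn_report"],
    "warehouse.dim_customer": ["bi.churn_report"],
}


def downstream(start: str, edges: dict[str, list[str]]) -> list[str]:
    """Breadth-first walk returning everything reachable from `start`."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(target: str, edges: dict[str, list[str]]) -> list[str]:
    """Walk the same graph backward to find every source feeding `target`."""
    reverse: dict[str, list[str]] = {}
    for parent, children in edges.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(target, reverse)


impacted = downstream("crm_db.customers", LINEAGE)
origins = upstream("bi.churn_report", LINEAGE)
```

Here `impacted` lists every dataset touched by a change to the CRM source, and `origins` traces the churn report back through the warehouse and Hadoop layers to that same source, which is the kind of question an auditor asks.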
More than that, the visibility Cloudera and Informatica provide gives our joint customers the ability to confidently stand up an enterprise data hub as a part of their production enterprise infrastructure, because they can verify the integrity of the data that undergirds their analytics. I encourage you to check out a demo of the Informatica-Cloudera Navigator integration at this link: http://infa.media/1uBpPbT
You can also check out a demo and learn a little more about Cloudera Navigator and the Informatica integration in the recorded TechTalk hosted by Informatica at this link:
Building an Enterprise Data Hub with proper Data Integration
Data flows into the enterprise from many sources, in many formats, sizes, and levels of complexity. And as enterprise architectures have evolved over the years, traditional data warehouses have become less of a final staging center for data, but rather, one component of the enterprise that interfaces with significant data flows. But since data warehouses should focus on being powerful engines for high value analytics, they should not be the central hub for data movement and data preparation (e.g. ETL/ELT), especially for the newer data types–such as social media, clickstream data, sensor data, internet-of-things-data, etc.–that are in use today.
When you start seeing data warehouse capacity consumed too quickly and performance degradation where end users are complaining about slower response times, and you risk not meeting your service-level agreements, then it might be time to consider an enterprise data hub (EDH). With an EDH, especially one built on Apache™ Hadoop®, you can plan a strategy around data warehouse optimization to get better use out of your entire enterprise architecture.
Of course, whenever you add another new technology to your data center, you care about interoperability. And since many systems in today’s architectures interoperate via data flows, it’s clear that sophisticated data integration technologies will be an important part of your EDH strategy. Today’s big data presents new challenges related to a wide variety of data types and formats, and the right technologies are needed to glue all the pieces together, whether those pieces are data warehouses, relational databases, Hadoop, or NoSQL databases.
Choosing a Data Integration Solution
Data integration software, at a high level, has one broad responsibility: to help you process and prepare your data with the right technology. This means it has to get your data to the right place in the right format in a timely manner. So it actually includes many tasks, but the end result is that timely, trusted data can be used for decision-making and risk management throughout the enterprise. You end up with a complete, ready-for-analysis picture of your business, as opposed to segmented snapshots based on a limited data set.
When evaluating a data integration solution for the enterprise, look for:
- Ease of use to boost developer productivity
- A proven track record in the industry
- Widely available technology expertise
- Experience with production deployments with newer technologies like Hadoop
- Ability to reuse data pipelines across different technologies (e.g. data warehouse, RDBMS, Hadoop, and other NoSQL databases)
Data integration is only part of the story. When you’re depending on data to drive business decisions and risk management, you clearly want to ensure the data is reliable. Data governance, data lineage, data quality, and data auditing remain as important topics in an EDH. Oftentimes, data privacy regulatory demands must be met, and the enterprise’s own intellectual property must be protected from accidental exposure.
To help ensure that data is sound and secure, look for a solution that provides:
- Centralized management and control
- Data certification prior to publication, transparent data and integration processes, and the ability to track data lineage
- Granular security, access controls, and data masking to protect data both in transit and at the source to prevent unauthorized access to specific data sets
Informatica is the data integration solution selected by many enterprises. Informatica’s family of enterprise data integration, data quality, and other data management products can manage data of any format, complexity level, or size, from any business system, and then deliver that data across the enterprise at the desired speed.
Watch the latest Gartner video to see Todd Goldman, Vice President and General Manager for Enterprise Data Integration at Informatica, as well as executives from Cisco and MapR, give their perspective on how businesses today can gain even more value from big data.
How would you like to wake up to an extra billion dollars, or maybe nine, in the bank? This has happened to a teacher in India. He discovered to his astonishment a balance of $9.8 billion in his bank account!
How would you like to be the bank that gave the client an extra nine billion dollars? Oh, to be a fly on the wall when the IT department got that call. How do you even begin to explain? Imagine the scrambling to track down the source of the data error.
This was a glaringly obvious error, which is easily caught. But there is potential for many smaller data errors. These errors may go undetected and add up, hurting your bottom line. How could this type of data glitch happen? More importantly, how can you protect your organization from these types of errors in your data?
A primary source of data mistakes is insufficient testing during Data Integration. Any change or movement of data harbors risk to its integrity. Unfortunately, there are often insufficient IT resources to adequately validate the data. Some organizations validate the data manually. This is a lengthy, unreliable process, fraught with data errors. Furthermore, manual testing does not scale well to large data volumes or complex data changes, so the validation is often incomplete. Finally, some organizations simply lack the resources to conduct any level of data validation at all.
Many of our customers have been able to successfully address this issue via automated data validation testing (also known as DVO). In a recent TechValidate survey, Informatica customers have told us that they:
- Reduce costs associated with data testing.
- Reduce time associated with data testing.
- Increase IT productivity.
- Increase the business trust in the data.
Customers tell us some of the biggest potential costs relate to damage control which occurs when something goes wrong with their data. The tale above, of our fortunate man and not so fortunate bank, can be one example. Bad data can hurt a company’s reputation and lead to untold losses in market-share and customer goodwill. In today’s highly regulated industries, such as healthcare and financial services, consequences of incorrect data can be severe. This can include heavy fines or worse.
Using automated data validation testing allows customers to save on ongoing testing costs and deliver reliable data. Just as important, it prevents pricey data errors, which require costly and time-consuming damage control. It is no wonder many of our customers tell us they are able to recoup their investment in less than 12 months!
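To give a feel for what an automated check does that a manual spot check cannot, here is a minimal Python sketch that compares a source table with its target after a data movement. It illustrates the general technique (row counts, key comparison, and per-row fingerprints) and is not a description of how Informatica’s product is implemented. The `validate` function, the column names, and the sample balances are all hypothetical.

```python
import hashlib


def row_fingerprint(row: dict, columns: list[str]) -> str:
    """Hash the compared columns of one row into a short, comparable value."""
    joined = "|".join(str(row.get(column)) for column in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def validate(source: list[dict], target: list[dict], key: str, columns: list[str]) -> dict:
    """Compare source and target rows by key and report every difference."""
    src = {row[key]: row_fingerprint(row, columns) for row in source}
    tgt = {row[key]: row_fingerprint(row, columns) for row in target}
    return {
        "row_count_match": len(source) == len(target),
        "missing_in_target": sorted(set(src) - set(tgt)),
        "unexpected_in_target": sorted(set(tgt) - set(src)),
        "mismatched": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Hypothetical account balances, in cents, before and after a data load.
source_rows = [
    {"account": "A1", "balance_cents": 12050},
    {"account": "A2", "balance_cents": 9800},
    {"account": "A3", "balance_cents": 1525},
]
target_rows = [
    {"account": "A1", "balance_cents": 12050},
    {"account": "A2", "balance_cents": 980000000000},  # the "extra billions" row
]

report = validate(source_rows, target_rows, key="account", columns=["balance_cents"])
```

In this made-up example the report flags account A2 as mismatched and account A3 as missing from the target. Because the comparison is just code, the same check can run on every load, at any volume, without someone eyeballing rows.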
The TechValidate survey shows us that customers are using data validation testing in a number of common use cases, including:
- Regression (Unit) testing
- Application migration or consolidation
- Software upgrades (Applications, databases, PowerCenter)
- Production reconciliation
One of the most beneficial use cases for data validation testing has been application migration and consolidation. Many SAP migration projects undertaken by our customers have greatly benefited from automated data validation testing. Application migration or consolidation projects are typically large and risky. A Bloor Research study has shown that 38% of data migration projects fail, either incurring overruns or being aborted altogether. According to a Harvard Business Review article, 1 in 6 large IT projects runs 200% over budget. Poor data management is one of the leading pitfalls in these types of projects. However, according to Bloor Research, Informatica’s data validation testing is a capability they have not seen elsewhere in the industry.
A particularly interesting example of the above use case is an M&A situation. The merged company is required to deliver ‘day-1 reporting’. However, FTC regulations forbid the separate entities from seeing each other’s data prior to the merger. What a predicament! The automated nature of data validation testing (automatically deploying preconfigured rules on large data sets) enables our customers to prepare for successful day-1 reporting under these harsh conditions.
And what about you? What are the costs to your business for potentially delivering incorrect, incomplete or missing data? To learn more about how you can provide the right data on time, every time, please visit www.datavalidation.me