Category Archives: Data Integration
“Inaccurate, inconsistent and disconnected supplier information prohibits us from doing accurate supplier spend analysis, leveraging discounts, comparing and choosing the best prices, and enforcing corporate standards.”
This is a quotation from a manufacturing company executive. It illustrates the negative impact that poorly managed supplier information can have on a company’s ability to cut costs and achieve revenue targets.
Many supply chain and procurement teams at large companies struggle to see the total relationship they have with suppliers across product lines, business units and regions. Why? Supplier information is scattered across dozens or hundreds of Enterprise Resource Planning (ERP) and Accounts Payable (AP) applications. Too much valuable time is spent manually reconciling inaccurate, inconsistent and disconnected supplier information in an effort to see the big picture. All this manual effort results in back office administrative costs that are higher than they should be.
Do these quotations from supply chain leaders and their teams sound familiar?
“We have 500,000 suppliers. 15-20% of our supplier records are duplicates. 5% are inaccurate.”
“I get 100 e-mails a day questioning which supplier to use.”
“To consolidate vendor reporting for a single supplier between divisions is really just a guess.”
“Every year 1099 tax mailings get returned to us because of invalid addresses, and we pay a lot of Schedule B fines to the IRS.”
“Two years ago we spent a significant amount of time and money cleansing supplier data. Now we are back where we started.”
Please join me and Naveen Sharma, Director of the Master Data Management (MDM) Practice at Cognizant, for a Webinar, Supercharge Your Supply Chain Applications with Better Supplier Information, on Tuesday, July 29th at 10 am PT.
During the Webinar, we’ll explain how better managing supplier information can help you achieve the following goals:
- Accelerate supplier onboarding
- Mitigate the risk of supply disruption
- Better manage supplier performance
- Streamline billing and payment processes
- Improve supplier relationship management and collaboration
- Make it easier to evaluate non-compliance with Service Level Agreements (SLAs)
- Decrease costs by negotiating favorable payment terms and SLAs
I hope you can join us for this upcoming Webinar!
This creative thinking to solve a problem came from a request to build a soldier knife from the Swiss Army. In the end, the solution was all about getting the right tool for the right job in the right place. In many cases soldiers didn’t need industrial strength tools; all they really needed was a compact and lightweight tool to get the job at hand done quickly.
Putting this into perspective with today’s world of Data Integration, using enterprise-class data integration tools for a smaller data integration project is overkill and typically out of reach for the smaller organization. However, these smaller data integration projects are just as important as the larger enterprise projects, and they are often the innovation behind a new way of business thinking. The traditional hand-coding approach to the smaller data integration project is not scalable, not repeatable and prone to human error. What’s needed is a compact, flexible and powerful off-the-shelf tool.
Thankfully, over a century after the world embraced the Swiss Army Knife, someone at Informatica was paying attention to revolutionary ideas. If you’ve not yet heard the news about the Informatica platform, a version called PowerCenter Express has been released, and it is free of charge. You can use it to handle an assortment of what I’d characterize as high complexity / low volume data integration challenges and experience a subset of the Informatica platform for yourself. I’d emphasize that PowerCenter Express doesn’t replace the need for Informatica’s enterprise grade products, but it is ideal for rapid prototyping, profiling data, and developing quick proofs of concept.
PowerCenter Express provides a glimpse of the evolving Informatica platform by integrating four Informatica products into a single, compact tool. There are no database dependencies and the product installs in just under 10 minutes. Much to my own surprise, I use PowerCenter Express quite often in the various aspects of my job with Informatica. I have it installed on my laptop so it travels with me wherever I go. It starts up quickly, so it’s ideal for getting a little work done on an airplane.
For example, I recently wanted to explore building some rules for an upcoming proof of concept on a plane ride home so I could claw back some personal time for my weekend. I used PowerCenter Express to profile some data and create a mapping. And this mapping wasn’t something I needed to throw away and recreate in an enterprise version after my flight landed. Vibe, Informatica’s build once / run anywhere metadata-driven architecture, allows me to export a mapping I create in PowerCenter Express to one of the enterprise versions of Informatica’s products such as PowerCenter, Data Quality or Informatica Cloud.
Because it is a free offering, I honestly didn’t expect too much from PowerCenter Express when I first started exploring it. However, due to my own positive experiences, I now like to think of PowerCenter Express as the Swiss Army Knife of Data Integration.
To start claiming back some of your personal time, get started with the free version of PowerCenter Express, found on the Informatica Marketplace at: https://community.informatica.com/solutions/pcexpress
- It’s difficult to find and retain resource skills to staff big data projects
- It takes too long to deploy Big Data projects from ‘proof-of-concept’ to production
- Big data technologies are evolving too quickly to adapt
- Big Data projects fail to deliver the expected value
- It’s difficult to make Big Data fit-for-purpose, assess trust, and ensure security
Informatica has extended its leadership in data integration and data quality to Hadoop with our Big Data Edition to address all of these Big Data challenges.
The biggest challenge companies face is finding and retaining Big Data resource skills to staff their Big Data projects. One large global bank started their first Big Data project with 5 Java developers, but as their Big Data initiative gained momentum they needed to hire 25 more Java developers that year. They quickly realized that while they had scaled their infrastructure to store and process massive volumes of data, they could not scale the necessary resource skills to implement their Big Data projects. The research mentioned earlier indicates that 80% of the work in a Big Data project relates to data integration and data quality. With Informatica you can staff Big Data projects with readily available Informatica developers instead of an army of developers hand-coding in Java and other Hadoop programming languages. In addition, we’ve proven to our customers that Informatica developers are up to 5 times more productive on Hadoop than hand-coding, and they don’t need to know how to program on Hadoop. A large Fortune 100 global manufacturer needed to hire 40 data scientists for their Big Data initiative. Do you really want these hard-to-find and expensive resources spending 80% of their time integrating and preparing data?
Another key challenge is that it takes too long to deploy Big Data projects to production. One of our Big Data Media and Entertainment customers told me, prior to purchasing the Informatica Big Data Edition, that most of his Big Data projects had failed. Naturally, I asked him why they had failed. His response was, “We have these hot-shot Java developers with a good idea which they prove out in our sandbox environment. But then when it comes time to deploy it to production they have to re-work a lot of code to make it perform and scale, make it highly available 24×7, have robust error-handling, and integrate with the rest of our production infrastructure. In addition, it is very difficult to maintain as things change. This results in project delays and cost overruns.” With Informatica, you can automate the entire data integration and data quality pipeline; everything you build in the development sandbox environment can be immediately and automatically deployed and scheduled for production as enterprise ready. Performance, scalability, and reliability are handled through configuration parameters, without the re-building or re-work that is typical with hand-coding. And Informatica makes it easier to reuse existing work and maintain Big Data projects as things change. The Big Data Edition is built on Vibe, our virtual data machine, and provides near universal connectivity so that you can quickly onboard new types of data of any volume and at any speed.
Big Data technologies are emerging and evolving extremely fast. This in turn becomes a barrier to innovation, since these technologies evolve much too quickly for most organizations to adopt before the next big thing comes along. What if you place the wrong technology bet and find that it is obsolete before you have barely gotten started? Hadoop is gaining tremendous adoption, but it has evolved along with other big data technologies, and there are literally hundreds of open source projects and commercial vendors in the Big Data landscape. Informatica is built on the Vibe virtual data machine, which means that everything you built yesterday and build today can be deployed on the major big data technologies of tomorrow. Today it is five flavors of Hadoop, but tomorrow it could be Hadoop and other technology platforms. One of our Big Data Edition customers stated after purchasing the product that “Informatica Big Data Edition with Vibe is our insurance policy to insulate our Big Data projects from changing technologies.” In fact, existing Informatica customers can take PowerCenter mappings they built years ago, import them into the Big Data Edition and, in many cases, run them on Hadoop with minimal changes and effort.
Another complaint from the business is that Big Data projects fail to deliver the expected value. In a recent survey (1), 86% of marketers say they could generate more revenue if they had a more complete picture of customers. We all know that the cost of selling a product to an existing customer is only about 10 percent of the cost of selling the same product to a new customer. But it’s not easy to cross-sell and up-sell to existing customers. Customer Relationship Management (CRM) initiatives help to address these challenges, but they too often fail to deliver the expected business value. The impact is low marketing ROI, poor customer experience, customer churn, and missed sales opportunities. By using Informatica’s Big Data Edition with Master Data Management (MDM) to enrich customer master data with Big Data insights, you can create a single, complete view of customers that yields tremendous results. We call this real-time customer analytics, and Informatica’s solution improves total customer experience by turning Big Data into actionable information so you can proactively engage with customers in real time. For example, this solution enables customer service to know which customers are likely to churn in the next two weeks so they can take the next best action, and it enables sales and marketing to determine next best offers based on customer online behavior to increase cross-sell and up-sell conversions.
Chief Data Officers and their analytics teams find it difficult to make Big Data fit-for-purpose, assess trust, and ensure security. According to the business consulting firm Booz Allen Hamilton, “At some organizations, analysts may spend as much as 80 percent of their time preparing the data, leaving just 20 percent for conducting actual analysis” (2). This is not an efficient or effective way to use highly skilled and expensive data science and data management resources. They should be spending most of their time analyzing data and discovering valuable insights. The result of all this is project delays, cost overruns, and missed opportunities. The Informatica Intelligent Data platform supports a managed data lake as a single place to manage the supply and demand of data, and it converts raw big data into fit-for-purpose, trusted, and secure information. Think of this as a Big Data supply chain to collect, refine, govern, deliver, and manage your data assets so your analytics team can easily find, access, integrate and trust your data in a secure and automated fashion.
If you are embarking on a Big Data journey I encourage you to contact Informatica for a Big Data readiness assessment to ensure your success and avoid the pitfalls of the top 5 Big Data challenges.
(1) Gleanster survey of 100 senior-level marketers, “Lifecycle Engagement: Imperatives for Midsize and Large Companies”, sponsored by YesMail.
(2) “The Data Lake: Take Big Data Beyond the Cloud”, Booz Allen Hamilton, 2013.
“Not only do we underestimate the cost for projects up to 150%, but we overestimate the revenue it will generate.” This quotation from an Energy & Petroleum (E&P) company executive illustrates the negative impact of inaccurate, inconsistent and disconnected well data and asset data on revenue potential.
“Operational Excellence” is a common goal of many E&P company executives pursuing higher growth targets. But inaccurate, inconsistent and disconnected well data and asset data may be holding them back. It obscures the complete picture of the well information lifecycle, making it difficult to maximize production efficiency, reduce Non-Productive Time (NPT), streamline the oilfield supply chain, calculate well-by-well profitability, and mitigate risk.
To explain how E&P companies can better manage well data and asset data, we hosted a webinar, “Attention E&P Executives: Streamlining the Well Information Lifecycle.” Our well data experts, Stephanie Wilkin, Senior Principal Consultant at Noah Consulting, and Stephan Zoder, Director of Value Engineering at Informatica, shared some advice. E&P companies should reevaluate “throwing more bodies at a data cleanup project twice a year.” This approach does not support the pursuit of operational excellence.
In this interview, Stephanie shares details about the award-winning collaboration between Noah Consulting and Devon Energy to create a single trusted source of well data, which is standardized and mastered.
Q. Congratulations on winning the 2014 Innovation Award, Stephanie!
A. Thanks Jakki. It was really exciting working with Devon Energy. Together we put the technology and processes in place to manage and master well data in a central location and share it with downstream systems on an ongoing basis. We were proud to win the 2014 Innovation Award for Best Enterprise Data Platform.
Q. What was the business need for mastering well data?
A. As E&P companies grow so do their needs for business-critical well data. All departments need clean, consistent and connected well data to fuel their applications. We implemented a master data management (MDM) solution for well data with the goals of improving information management, business productivity, organizational efficiency, and reporting.
Q. How long did it take to implement the MDM solution for well data?
A. The Devon Energy project kicked off in May of 2012. Within five months we built the complete solution from gathering business requirements to development and testing.
Q. What were the steps in implementing the MDM solution?
A: The first and most important step was securing buy-in on a common definition for master well data or Unique Well Identifier (UWI). The key was to create a definition that would meet the needs of various business functions. Then we built the well master, which would be consistent across various systems, such as G&G, Drilling, Production, Finance, etc. We used the Professional Petroleum Data Management Association (PPDM) data model and created more than 70 unique attributes for the well, including Lahee Class, Fluid Direction, Trajectory, Role and Business Interest.
As part of the original go-live, we had three source systems of well data and two target systems connected to the MDM solution. Over the course of the next year, we added three additional source systems and four additional target systems. We did a cross-system analysis to make sure every department has the right wells and the right data about those wells. Now the company uses MDM as the single trusted source of well data, which is standardized and mastered, to do analysis and build reports.
Q. What’s been the traditional approach for managing well data?
A. Typically when a new well is created, employees spend time entering well data into their own systems. For example, one person enters well data into the G&G application. Another person enters the same well data into the Drilling application. A third person enters the same well data into the Finance application. According to statistics, it takes about 30 minutes to enter a single well into a particular financial application.
So imagine you need to add 500 new wells to your systems. This is common after a merger or acquisition. At 30 minutes per well, that translates to roughly 250 hours, or 6.25 forty-hour weeks, of employee time for that one application, time that can be saved by automating the well create process! By automating across systems, you not only save time, you eliminate redundant data entry and possible errors in the process.
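The arithmetic behind that estimate can be checked with a short sketch. The 30 minutes per well and the 500 wells come from the interview; the 40-hour work week and the `applications` multiplier are assumptions added for illustration.

```python
# Back-of-the-envelope estimate of manual well-entry effort.
MINUTES_PER_WELL = 30      # figure quoted in the interview, for one application
NEW_WELLS = 500            # e.g. wells added after a merger or acquisition
HOURS_PER_WORK_WEEK = 40   # assumption: a standard 40-hour week


def manual_entry_hours(wells: int, minutes_per_well: float, applications: int = 1) -> float:
    """Total hours spent keying the same wells into each application."""
    return wells * minutes_per_well * applications / 60


hours = manual_entry_hours(NEW_WELLS, MINUTES_PER_WELL)
weeks = hours / HOURS_PER_WORK_WEEK
print(f"{hours:.0f} hours, or {weeks:.2f} work weeks")
```

If the same wells were also keyed into, say, three applications at a similar 30 minutes each (an assumption, since the interview gives a figure for one financial application only), `manual_entry_hours(500, 30, applications=3)` would put the total at 750 hours.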
Q. That sounds like a painfully slow and error-prone process.
A. It is! But that’s only half the problem. Without a single trusted source of well data, how do you get a complete picture of your wells? When you compare the well data in the G&G system to the well data in the Drilling or Finance systems, it’s typically inconsistent and difficult to reconcile. This leads to the question, “Which one of these systems has the best version of the truth?” Employees spend too much time manually reconciling well data for reporting and decision-making.
Q. So there is a lot to be gained by better managing well data.
A. That’s right. The CFO typically loves the ROI on a master well data project. It’s a huge opportunity to save time and money, boost productivity and get more accurate reporting.
Q: What were some of the business requirements for the MDM solution?
A: We couldn’t build a solution that was narrowly focused on meeting the company’s needs today. We had to keep the future in mind. Our goal was to build a framework that was scalable and supportable as the company’s business environment changed. This allows the company to add additional data domains or attributes to the well data model at any time.
Q: Why did you choose Informatica MDM?
A: The decision to use Informatica MDM for the MDM Trust Framework came down to the following capabilities:
- Match and Merge: With Informatica, we get a lot of flexibility. Some systems carry the API or well government ID, but some don’t. We can match and merge records differently based on the system.
- X-References: We keep a cross-reference between all the systems. We can go back to the master well data and find out where that data came from and when. We can see where changes have occurred because Informatica MDM tracks the history and lineage.
- Scalability: This was a key requirement. While we went live after only 5 months, we’ve been continually building out the well master based on the requirements of the target systems.
- Flexibility: Down the road, if we want to add an additional facet or classification to the well master, the framework allows for that.
- Simple Integration: Instead of building point-to-point integrations, we use the hub model.
In addition to Informatica MDM, our Noah Consulting MDM Trust Framework includes Informatica PowerCenter for data integration, Informatica Data Quality for data cleansing and Informatica Data Virtualization.
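To make the match-and-merge and cross-reference ideas above concrete, here is a toy sketch in Python. It is not Informatica MDM's API or the project's actual rules. The record fields, the name-normalization rule and the sample well IDs are all hypothetical. It matches on the government well ID when both sides carry one, falls back to a normalized well name otherwise, and keeps a cross-reference to every contributing source record.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SourceRecord:
    system: str                        # e.g. "Drilling" or "Finance"
    source_id: str                     # the record's key in its own system
    well_name: str
    api_number: Optional[str] = None   # government well ID; some systems lack it


@dataclass
class MasterWell:
    well_name: str
    api_number: Optional[str] = None
    # Cross-references: (system, source_id) of every record merged into this master.
    xrefs: List[Tuple[str, str]] = field(default_factory=list)


def normalize(name: str) -> str:
    """Hypothetical name rule: upper-case, drop '#', collapse whitespace."""
    return " ".join(name.upper().replace("#", " ").split())


def match(record: SourceRecord, masters: List[MasterWell]) -> Optional[MasterWell]:
    for master in masters:
        if record.api_number and master.api_number:
            # Both sides carry the government ID, so the ID alone decides.
            if record.api_number == master.api_number:
                return master
            continue
        # Otherwise fall back to comparing normalized well names.
        if normalize(record.well_name) == normalize(master.well_name):
            return master
    return None


def merge(records: List[SourceRecord]) -> List[MasterWell]:
    masters: List[MasterWell] = []
    for record in records:
        master = match(record, masters)
        if master is None:
            master = MasterWell(record.well_name, record.api_number)
            masters.append(master)
        elif master.api_number is None:
            master.api_number = record.api_number  # fill in a missing ID
        master.xrefs.append((record.system, record.source_id))
    return masters


# Made-up sample data for illustration only.
records = [
    SourceRecord("Drilling", "D-17", "Smith #1", "42-001-00001"),
    SourceRecord("Finance", "F-902", "SMITH 1"),            # no government ID
    SourceRecord("G&G", "G-5", "Jones 2", "42-001-00002"),
]
masters = merge(records)
for m in masters:
    print(m.well_name, m.api_number, m.xrefs)
```

In this sample the Drilling and Finance records collapse into one master well, and the cross-reference list shows which systems contributed to it. A production hub would also track history and survivorship rules, which this sketch leaves out.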
Q: Can you give some examples of the business value gained by mastering well data?
A: One person said to me, “I’m so overwhelmed! We’ve never had one place to look at this well data before.” With MDM centrally managing master well data and fueling key business applications, many upstream processes can be optimized to achieve their full potential value.
People spend less time entering well data on the front end and reconciling well data on the back end. Well data is entered once and it’s automatically shared across all systems that need it. People can trust that it’s consistent across systems. Also, because the data across systems is now tied together, it provides business value they were unable to realize before, such as predictive analytics.
Q. What’s next?
A. There’s a lot of insight that can be gained by understanding the relationships between the well, and the people, equipment and facilities associated with it. Next, we’re planning to add the operational hierarchy. For example, we’ll be able to identify which production engineer, reservoir engineer and foreman are working on a particular well.
We’ve also started gathering business requirements for equipment and facilities to be tied to each well. There’s a lot more business value on the horizon as the company streamlines their well information lifecycle and the valuable relationships around the well.
If you missed the webinar, you can watch the replay now: Attention E&P Executives: Streamlining the Well Information Lifecycle.
My first job out of college was to figure out how to get devices that monitored and controlled an advanced cooling and heating system to communicate with a centralized and automated control center. We ended up building custom PCs for the application, running a version of Unix (DOS would not cut it), and the PCs mounted in industrial cases would communicate with the temperature and humidity sensors, as well as turn on and turn off fans and dampers.
At the end of the day, this was a data integration problem, not an engineering problem, that we were attempting to solve. The devices had to talk to the PCs, and the PCs had to talk to a centralized system (Mainframe) that was able to receive the data, as well as use that data to determine what actions to take. For instance, it needed the ability to determine that 78 degrees was too warm for a clean room, that a damper had to be opened and a fan turned on to reduce the temperature, and that the fan should turn off when the temperature returned to normal.
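The decision rule described above can be sketched in a few lines of Python. The 78-degree limit comes from the story; the 72-degree "normal" set point, the `Actuators` type and the function name are assumptions for illustration, not a reconstruction of the original system.

```python
from dataclasses import dataclass

MAX_TEMP_F = 78.0      # from the story: 78 degrees is too warm for a clean room
NORMAL_TEMP_F = 72.0   # assumption: the temperature considered "back to normal"


@dataclass(frozen=True)
class Actuators:
    damper_open: bool = False
    fan_on: bool = False


def control_step(temp_f: float, state: Actuators) -> Actuators:
    """Decide the next actuator state from one temperature reading."""
    if temp_f >= MAX_TEMP_F:
        return Actuators(damper_open=True, fan_on=True)    # start cooling
    if temp_f <= NORMAL_TEMP_F:
        return Actuators(damper_open=False, fan_on=False)  # back to normal
    return state  # in between: keep doing whatever we were doing


state = Actuators()
history = []
for reading in [70.0, 78.0, 75.0, 72.0]:
    state = control_step(reading, state)
    history.append(state.fan_on)
print(history)
```

Keeping the current state between the two thresholds avoids switching the fan on and off repeatedly when the temperature hovers near the limit.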
Back in the day, we had to create and deploy custom drivers and software. These days, most devices have well-defined interfaces, or APIs, that developers and data integration tools can access to gather information from that device. We also have high performing networks. Much like any source or target system, these devices produce data which is typically bound to a structure, and that data can be consumed and restructured to meet the needs of the target system.
For instance, data coming off a smart thermostat in your home may be in the following structure:
Device (char 10)
Date (char 8)
Temp (num 3)
You’re able to access this device using an API (typically a REST-based Web Service), which returns a single chunk of data which is bound to the structure, such as:
Then you can transform the structure into something that’s native to the target system that receives this data, as well as translate the data (e.g., converting the Date from characters to numbers). This is where data integration technology makes money for you, given its ability to deal with the complexity of translating and transforming the information that comes off the device, so it can be placed in a system or data store that’s able to monitor, analyze, and react to this data.
This is really what the IoT is all about: the ability to have devices spin out data that is leveraged to make better use of the devices. The possibilities are endless as to what can be done with that data, and how we can better manage these devices. Data integration is key. Trust me, it’s much easier to integrate with devices these days than it was back in the day.
Thank you for reading about Data Integration with Devices! Editor’s note: For more information on Data Integration, consider downloading “Data Integration for Dummies”.
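As a minimal sketch of that translate-and-transform step, the Python below parses one reading laid out as Device (char 10), Date (char 8), Temp (num 3). The sample payload, the fixed-width packing and the YYYYMMDD date format are assumptions made for illustration; a real device's API would define its own format.

```python
from datetime import datetime

# Field layout from the structure above: name and width in characters.
FIELDS = (("device", 10), ("date", 8), ("temp", 3))


def parse_reading(raw: str) -> dict:
    """Split a fixed-width record and convert each field to a native type."""
    expected = sum(width for _, width in FIELDS)
    if len(raw) != expected:
        raise ValueError(f"expected {expected} characters, got {len(raw)}")

    chunks, pos = {}, 0
    for name, width in FIELDS:
        chunks[name] = raw[pos:pos + width]
        pos += width

    return {
        "device": chunks["device"].strip(),
        # Assumption: the 8-character date is YYYYMMDD.
        "date": datetime.strptime(chunks["date"], "%Y%m%d").date(),
        "temp": int(chunks["temp"]),
    }


# Hypothetical payload: device ID, then date, then temperature.
reading = parse_reading("THERMO000120240131072")
print(reading)
```

The target system then receives a typed record (a string, a date and an integer) instead of one opaque chunk of characters.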
Recently, I had the opportunity to interview a half dozen CIOs and a half dozen CFOs. Kind of like a marriage therapist, I got to hear each party’s story about the relationship. CFOs, in particular, felt that the quality of the relationship could impact their businesses’ success. Armed with this knowledge, I wanted to see if I could help each leader build a better working relationship. Previously, I let CIOs know about the emergence and significance of the strategic CFO. In today’s post, I will start by sharing the CIOs’ perspective on the CFO relationship and then I will discuss how CFOs can build better CIO relationships.
CIOs feel under the gun these days!
If you don’t know, CIOs feel under the gun these days. CIOs see their enterprises demanding ubiquitous computing. Users want to use their apps and expect corporate apps to look like their personal apps such as Facebook. They want to bring their own preferred devices. Most of all, they want all their data on any device when they need it. This means CIOs are trying to manage a changing technical landscape of mobile, cloud, social, and big data. These are all vying for both dollars and attention. As a result, CIOs see their role in a sea change. Today, they need to focus less on building things and more on managing vendors. CIOs say that they need to 1) better connect what IT is doing to support the business strategy; 2) improve technical orchestration; and 3) improve process excellence. This is a big and growing charter.
CIOs see the CFO conversation being just about the numbers
CIOs worry that you don’t understand how many things are now being run by IT and that historical percentages of revenue may no longer be appropriate. Think about healthcare, which used to be a complete laggard in technology but today is having everything digitalized. Even a digital thermometer plugs into an iPad so it directly communicates with a patient record. The world has clearly changed. And CIOs worry that you view IT as merely a cost center and that you do not see the value generated through IT investment or the asset that information provides to business decision makers. However, the good news is that I believe a different type of discussion is possible, and that CFOs have the opportunity to play an important role in helping to shape the value that CIOs deliver to the business.
CFOs should share their experience and business knowledge
CFOs that I talked to said that they believe the CFO/CIO relationship needs to be complementary and that the roles have the most concentric rings. These CFOs believe that the stronger the relationship, the better it is for their business. One area where you can help the CIO is in sharing your knowledge of the business and business needs. CIOs are trying to get closer to the business, and you can help build this linkage and support the requests that come out of this process. Clearly, an aligned CFO can be “one of the biggest advocates of the CIO”. Given this, make sure that you are on your CIO’s Investment Committee.
Tell your CIO about your data pains
CFOs need to be good customers too. CFOs that I talked to told me that they know their business has “a data issue”. They worry about the integrity of data from the source. CFOs see their role as relying increasingly on timely, accurate data. They also know they have disparate systems and too much manual work going on in the back office. For them, integration needs to exist from the frontend to the backend. Their teams personally feel the large number of manual steps.
For this reason, the CFOs we talked to believe that the integration of data is a big issue whether they are in a small or large business. Have you talked to your CIO about data integration or quality projects to change the ugliness that you have to live with day in, day out? It will make you and the business more efficient. One CFO was blunt here, saying “making life easier is all about the systems. If the systems suck then you cannot trust the numbers when you get them. You want to access the numbers easily, timely, and accurately. You want to make it easier to forecast so you can set expectations with the business and externally.”
At the same time, CFOs that I talked to worried about the quality of financial and business data analysis. Once they have the data, they worry about being able to analyze the information effectively. Increasingly, CFOs say that they need to help drive synergies across their businesses. At the same time, CFOs increasingly need to manage upward with information. They want information for decision makers so they can make better decisions.
Changing the CIO Dialog
So it is clear that CFOs like you see data, in particular financial data, as a competitive advantage. The question from your unofficial therapist is this: why aren’t you having a discussion with your CIO that goes beyond the numbers or the financial justification for this or that system, and asking about the integration investment that can make your integration problems go away?
The strategic CFO is different than the “1975 Controller CFO”
Traditionally, CIOs have tended to work with what one CIO called a “1975 Controller CFO”. For this reason, the relationship between CIOs and CFOs was expressed well in a single word: “contentious”. But a new type of CFO is emerging that offers the potential of a different type of relationship. These so-called “strategic CFOs” can be an effective ally for CIOs. The question is, which type of CFO do you have? In this post, I will provide you with a bit of a litmus test so you can determine what type of CFO you have, but more importantly, I will share how you can take maximum advantage of a relationship with a strategically oriented CFO. But first let’s hear a bit more of the CIOs’ reactions to CFOs.
Views of CFOs according to CIO interviews
Clearly, “the relationship…with these CFOs is filled with friction”. Controller CFOs “do not get why so many things require IT these days. They think that things must be out of whack.” One CIO said that these CFOs think technology should only cost 2-3% of revenue, while it can easily reach 8-9% of revenue these days. Another CIO complained that their discussions with Controller CFOs are only about IT productivity and effectiveness. In their eyes, this has limited the topics of discussion to IT cost reduction, IT-produced business savings, and the soundness of the current IT organization. Unfortunately, this CIO believes that Controller CFOs are not concerned with creating business value and do not see information as an asset. Instead, they view IT as a cost center. Another CIO says Controller CFOs are just about the numbers and see the CIO role as being about signing checks. It is a classic “demand versus supply” issue. At the same time, CIOs say that they see reporting to a Controller CFO as a narrowing function. They also believe it signals to the rest of the organization “that IT is not strategic and less important than other business functions”.
What then is this strategic CFO?
In contrast to their controller peers, strategic CFOs often have a broader business background than their accounting and a CPA peers. Many have, also, pursued an MBA. Some have public accounting experience. Others yet come from professions like legal, business development, or investment banking.
More important than where they came from, strategic CFOs see a world that is about more than just numbers. They want to be more externally facing and to understand their company’s businesses. They tend to focus as much on what is going to happen as they do on what has happened. Remember, financial accounting is backward facing. Given this, strategic CFOs spend a lot of time trying to understand what is going on in their firm’s businesses. One strategic CFO said that they do this so they can contribute and add value—I want to be a true business leader. And taking this posture often puts them in the top three decision makers for their business. There may be lessons in this posture for technology focused CIOs.
Why is a strategic CFO such a game changer for the CIO?
One CIO put it this way: “If you have a modern day CFO, then they are an enabler of IT”. Strategic CFOs agree. They see themselves as having “the most concentric circles with the CIO”. They believe that they need “CIOs more than ever to extract data to do their jobs better and to provide the management information business leadership needs to make better business decisions”. At the same time, the perspective of a strategic CFO can be valuable to the CIO because they have good working knowledge of what the business wants. They also tend to be close to the management information systems and computer systems. CFOs typically understand the needs of the business better than most staff functions. The CFO, therefore, can be the biggest advocate of the CIO. This is why strategic CFOs should be on the CIO’s Investment Committee. Finally, a strategic CFO can help a CIO ensure their technology selections meet affordability targets and are compliant with the corporate strategy.
Are the priorities of a strategic CFO different?
Strategic CFOs still care about P&L, Expense Management, Budgetary Control, Compliance, and Risk Management. But they are also concerned about performance management for the enterprise as a whole and about senior management reporting. As well, they want to do the above tasks faster so finance and other functions can do in-period management by exception. For this reason they see data and data analysis as a big issue.
Strategic CFOs care about data integration
In interviews with strategic CFOs, I saw a group of people who truly understand the data holes in the current IT system. They also grasp firsthand the value proposition of investing to fix things here. These CFOs say that they worry “about the integrity of data from the source and about being able to analyze information”. They say that they want the integration to be good enough that at the push of a button they can get an accurate report. Otherwise, they have to “massage the data and then send it through another system to get what you need”.
These CFOs say that they really feel the pain of systems not talking to each other. They understand that fixing this means making disparate systems, from the frontend to the backend, talk to one another. But they also believe that making things less manual will have important consequences, including their own ability to inspect the books more frequently. Given this, they see data as a competitive advantage. One CFO even said that they thought data is the last competitive advantage.
Strategic CFOs are also worried about data security. They believe their auditors are going after this with a vengeance. They are really worried about getting hacked. One said, “Target scared a lot of folks and was in many respects a watershed event”. At the same time, strategic CFOs want to be able to drive synergies across the business. One CFO even extolled the value of a holistic view of the customer. When I asked why this was a finance objective versus a marketing objective, they said that finance is responsible for business metrics and that there are gaps in those metrics around the customer, including the percentage of cross-sell taking place between business units. Another CFO amplified this theme by saying that “increasingly we need to manage upward with information. For this reason, we need information for decision makers so they can make better decisions”. Another strategic CFO summed this up by saying “the integration of the right systems to provide the right information needs to be done so we and the business have the right information to manage and make decisions at the right time”.
So what are you waiting for?
If you are lucky enough to have a Strategic CFO, start building your relationship. And you can start by discussing their data integration and data quality problems. So I have a question for you. How many of you think you have a Controller CFO versus a Strategic CFO? Please share here.
However, the nature of data integration has evolved, and so has the way we define the value. The operational benefits are still there, but there are more strategic benefits to consider as well.
Data integration patterns have progressed from simple patterns that replicated data amongst systems and data stores, to more service-based use of core business data that is able to provide better time-to-market advantages and much better agility. These are the strategic concepts that, when measured, add up to much more value than the simple operational advantages we first defined as the ROI of data integration.
The new ROI for data integration can be defined a few ways, including:
The use of data services to combine core data assets with composite applications and critical business processes. This allows those who leverage data services, which is a form of data integration, to mix and match data services to provide access to core applications or business processes. The applications leverage the data services (typically REST-based Web services) as ways to access back-end data stores, and can even redefine the metadata for the application or process (a.k.a., Data Virtualization).
This compresses time-to-market for critical business solutions, and so delivers a substantial return on investment. What’s more important is the enterprise’s new ability to adapt to new business opportunities, and thus capture the value of agility. This is clearly where the majority of ROI resides.
The use of integrated data to make better automated operational decisions. This means that we’re taking integrated data, either as services or through simple replication, and using that data to make automated decisions. Examples would be the ability to determine if inventory levels will support an increase in sales, or if the risk levels for financial trades are too high.
The use of big data analytics to define advanced use of data, including predicting the future. This refers to the process of leveraging big data, and big data analytics, to make critical calls around the business, typically calls that are more strategic in nature. An example would be the use of predictive analytics that leverages petabytes of data to determine if a product line is likely to be successful, or if the production levels will likely decline or increase. This is different than operational use of data, as we discussed previously, in that we’re making strategic versus tactical use of the information derived from the data. The ROI here, as you would guess, is huge.
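To make the automated operational decision described above more concrete, here is a minimal Python sketch of the inventory example. It assumes the integrated data has already been combined into one record per product. The `InventoryRecord` fields, the `can_support_uplift` function and the sample figures are all hypothetical and serve only as illustration; a real rule would depend on your own systems and policies.

```python
from dataclasses import dataclass


@dataclass
class InventoryRecord:
    """One row of integrated data (hypothetical schema, for illustration only)."""
    sku: str
    on_hand: int         # units reported by the warehouse system
    inbound: int         # units on open purchase orders
    forecast_units: int  # baseline sales forecast from the sales system


def can_support_uplift(rec: InventoryRecord, uplift: float) -> bool:
    """Return True if available stock covers the forecast raised by `uplift`."""
    required = rec.forecast_units * (1 + uplift)
    return rec.on_hand + rec.inbound >= required


# Made-up sample records standing in for data pulled from several systems.
records = [
    InventoryRecord("A-100", on_hand=500, inbound=200, forecast_units=600),
    InventoryRecord("B-200", on_hand=100, inbound=50, forecast_units=140),
]

# Decide, per product, whether a 10% sales increase can be supported.
decisions = {r.sku: can_support_uplift(r, uplift=0.10) for r in records}
print(decisions)
```

The point of the sketch is that the rule itself is trivial. The hard part, and the source of the value, is having stock, purchase order and forecast data integrated into one place so the rule can run without manual reconciliation.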
A general pattern is that the ROI is much greater around data integration than it was just 5 years ago. This is due largely to the fact that enterprises understand that data is everything, when it comes to driving a business. The more effective the use of data, the better you can drive the business, and that means more ROI. It’s just that simple.
Editor’s note: For more information on Data Integration, consider downloading “Data Integration for Dummies”.
In the media there is a constant discussion about a mismatch between the skills that education provides and the capabilities graduates need in the workplace, and about whether graduates are prepared for work. If students do not work with large data sets, skills needed by employers may be missing. I will outline the skills that could be gained by working with large data sets.
Some types of data handling are just high volume. Business intelligence and analytics consume more data than 20 years ago. Handling the increasing volume is important. Research programming and data science are truly part of big data. Even if you are not doing data science, you may be preparing and handling the data sets. Some industries and organisations just have higher volumes of data. Retail is one example. Companies that used to have less volume are obtaining more data as they adapt to the big data world. We should expect the same trend to continue with organisations that have had higher data volumes in the past. They are going to have to handle even larger volumes.
There are practical aspects to handling large data sets. These can lead to experience in storage management and design, data loading, query optimization, parallelization, bandwidth issues and data quality when large data sets are used. And when you take on those issues, architecture skills are needed and can be gained.
Today, the trends known as the Internet of Things, All Things Data, and Data First are forming. As a result there will be demand for graduates who are familiar with handling high volumes of data.
The responsibility for using a large data set falls to the student. Faculty staff need to encourage this, though, because they often set and guide the students’ goals. A number of large data sets that could be used by students are on the web. One example is the Harvard Library Bibliographic Dataset, available at http://openmetadata.lib.harvard.edu/bibdata. Another example is the City of Chicago, which makes a number of datasets available for download in a wide range of standard formats at https://data.cityofchicago.org/. The advantage of large public data sets is the volume and the opportunity to assess the quality of the data. Public data sets can hold many records. They represent many more combinations than we can quickly generate by hand. Using even a small real-world data set is a vast improvement over the likely limited number of variations in self-generated data. It may be even better than using a tool to generate data. Such data, once downloaded, can be manipulated and used as a base for loading.
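As an illustration of what assessing the data quality of a downloaded data set might look like, here is a small Python sketch that uses only the standard library. It counts rows, blank values per column and key values that appear more than once. The `profile_rows` function, the column names and the in-memory sample are assumptions made for this example; they do not reflect the actual layout of the Harvard or Chicago data sets.

```python
import csv
import io
from collections import Counter


def profile_rows(rows, key_field):
    """Count rows, blank values per column, and key values seen more than once."""
    total = 0
    blanks = Counter()
    keys = Counter()
    for row in rows:
        total += 1
        for column, value in row.items():
            if column is None:
                continue  # extra fields beyond the header row are ignored here
            if value is None or value.strip() == "":
                blanks[column] += 1
        keys[row[key_field]] += 1
    duplicate_keys = sum(1 for count in keys.values() if count > 1)
    return {"rows": total, "blanks": dict(blanks), "duplicate_keys": duplicate_keys}


# Tiny in-memory sample standing in for a downloaded CSV file (made-up values).
sample = io.StringIO(
    "id,title,year\n"
    "1,Data Quality,2010\n"
    "2,,2012\n"
    "2,Integration Basics,\n"
)
report = profile_rows(csv.DictReader(sample), key_field="id")
print(report)
```

With a real download, the `io.StringIO` sample would be replaced by an opened file. Because the function reads one row at a time, it also works on files too large to hold in memory.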
Loading large data sets is part of being prepared. It requires the use of tools. These tools range from simple loaders to full data integration tool suites. A good option for students who need to load data sets is PowerCenter Express. It was announced last year. It is free for use with up to 250,000 rows per day. It is an ideal way to experience a full enterprise data integration tool and work with significantly higher volumes.
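For students who want to see what a simple loader does before moving to a full tool, the following Python sketch loads rows into a database in fixed-size batches. It is not PowerCenter Express and does not use its interface; it is a standard-library stand-in that uses SQLite. The `items` table, the `load_in_batches` function and the generated rows are hypothetical. In practice the rows would come from a downloaded data set rather than a generator.

```python
import sqlite3
from itertools import islice


def load_in_batches(conn, rows, batch_size=1000):
    """Insert (id, title) tuples in fixed-size batches; return the rows loaded."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER, title TEXT)")
    iterator = iter(rows)
    loaded = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        with conn:  # commits the batch, or rolls it back if an insert fails
            conn.executemany("INSERT INTO items (id, title) VALUES (?, ?)", batch)
        loaded += len(batch)
    return loaded


conn = sqlite3.connect(":memory:")
# Placeholder rows; a real exercise would read them from a downloaded file.
generated = ((i, f"title {i}") for i in range(2500))
loaded = load_in_batches(conn, generated, batch_size=1000)
stored = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(loaded, stored)
```

Batching keeps memory use flat and limits how much work is lost if a load fails part way through. These are two of the practical concerns that only show up once the volume grows.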
Big Data is here and it is a growing trend. And so students need to work with larger data sets than before. It is also feasible. The tools and the data sets the students need to work with large data sets are available. Therefore, in view of the current trends, large data set use should become standard practice in computer science and related courses.