My first job out of college was to figure out how to get devices that monitored and controlled an advanced cooling and heating system to communicate with a centralized and automated control center. We ended up building custom PCs for the application, running a version of Unix (DOS would not cut it), and the PCs mounted in industrial cases would communicate with the temperature and humidity sensors, as well as turn on and turn off fans and dampers.
At then end of the day, this was a data integration, not an engineering problem, that we were attempting to solve. The devices had to talk to the PCs, and the PC had to talk to a centralized system (Mainframe) that was able to receive the data, as well as use that data to determine what actions to take. For instance, the ability determine that 78 degrees was too warm for a clean room, and that a damper had to be open and a fan turned on to reduce the temperature, and then turn off when the temperature returned to normal.
Back in the day, we had to create and deploy custom drivers and software. These days, most devices have well-defined interfaces, or APIs, that developers and data integration tools can access to gather information from that device. We also have high performing networks. Much like any source or target system, these devices produce data which is typically bound to a structure, and that data can be consumed and restructured to meet the needs of the target system.
For instance, data coming off a smart thermostat in your home may be in the following structure:
Device (char 10)
Date (char 8)
Temp (num 3)
You’re able to access this device using an API (typically a REST-based Web Service), which returns a single chunk of data which is bound to the structure, such as:
Then you can transform the structure into something that’s native to the target system that receives this data, as well as translate the data (e.g., converting the Data form characters to numbers). This is where data integration technology makes money for you, given its ability to deal with the complexity of translating and transforming the information that comes off the device, so it can be placed in a system or data store that’s able to monitor, analyze, and react to this data.
This is really what the IOT is all about; the ability to have devices spin out data that is leveraged to make better use of the devices. The possibilities are endless, as to what can be done with that data, and how we can better manage these devices. Data integration is key. Trust me, it’s much easier to integrate with devices these days than it was back in the day.
Thank you for reading about Data Integration with Devices! Editor’s note: For more information on Data Integration, consider downloading “Data Integration for Dummies“
However, the nature of data integration has evolved, and so has the way we define the value. The operational benefits are still there, but there are more strategic benefits to consider as well.
Data integration patterns have progressed from simple patterns that replicated data amongst systems and data stores, to more service-based use of core business data that is able to provide better time-to-market advantages and much better agility. These are the strategic concepts that, when measured, add up to much more value than the simple operational advantages we first defined as the ROI of data integration.
The new ROI for data integration can be defined a few ways, including:
The use of data services to combine core data assets with composite applications and critical business processes. This allows those who leverage data services, which is a form of data integration, to mix and match data services to provide access to core applications or business processes. The applications leverage the data services (typically REST-based Web services) as ways to access back-end data stores, and can even redefine the metadata for the application or process (a.k.a., Data Virtualization).
This provides for a compressed time-to-market for critical business solutions, thus returning much in the way of investment. What’s more important is the enterprise’s new ability to change to adapt to new business opportunities, and thus get to the value of agility. This is clearly where the majority of ROI resides.
The use of integrated data to make better automated operational decisions. This means that we’re taking integrated data, either as services or through simple replication, or using that data to make automated decisions. Examples would be the ability to determine if inventory levels will support an increase in sales, or if the risk levels for financial trades are too high.
The use of big data analytics to define advanced use of data, including predicting the future. Refers to the process of leveraging big data, and big data analytics, to make critical calls around the business, typically calls that are more strategic in nature. An example would be the use of predictive analytics that leverages petabytes of data to determine if a product line is likely to be successful, or if the production levels will likely decline or increase. This is different than operational use of data, as we discussed previously, in that we’re making strategic versus tactical use of the information derived from the data. The ROI here, as you would guess, is huge.
A general pattern is that the ROI is much greater around data integration than it was just 5 years ago. This is due largely to the fact that enterprises understand that data is everything, when it comes to driving a business. The more effective the use of data, the better you can drive the business, and that means more ROI. It’s just that simple.
Editor’s note: For more information on Data Integration, consider downloading “Data Integration for Dummies“
The growth of big data drives many things, including the use of cloud-based resources, the growth of non-traditional databases, and, of course, the growth of data integration. What’s typically not as well understood are the required patterns of data integration, or, the ongoing need for better and more innovative data cleansing tools.
Indeed, while writing Big Data@Work: Dispelling the Myths, Uncovering the Opportunities, Tom Davenport observed data scientists at work. During his talk at VentureBeat’s DataBeat conference, Davenport said data scientists would need better data integration and data cleansing tools before they’d be able to keep up with the demand within organizations.
But Davenport is not alone. Most who deploy big data systems see the need for data integration and data cleansing tools. In most instances, not having those tools in place hindered progress.
I would agree with Davenport, in that the number one impediment to moving to any type of big data is how to clean and move data. Addressing that aspect of big data is Job One for enterprise IT.
The fact is, just implementing Hadoop-based databases won’t make a big data system work. Indeed, the data must come from existing operational data stores, and leverage all types of interfaces and database models. The fundamental need to translate the data structure and content to effectively move from one data store (or stores, typically) to the big data systems has more complexities than most enterprises understand.
The path forward may require more steps than originally anticipated, and perhaps the whole big data thing was sold as something that’s much easier than it actually is. My role for the last few years is to be the guy who lets enterprises know that data integration and data cleansing are core components to the process of building and deploying big data systems. You may as well learn to deal with it early in the process.
The good news is that data integration is not a new concept, and the technology is more than mature. What’s more, data cleansing tools can now be a part of the data integration technology offerings, and actually clean the data as it moves from place to place, and do so in near real-time.
So, doing big data anytime soon? Now is the time to define your big data strategy, in terms of the new technology you’ll be dragging into the enterprise. It’s also time to expand or change the use of data integration and perhaps the enabling technology that is built or designed around the use of big data.
I hate to sound like broken record, but somebody has to say this stuff.
I recently came across an article from 2006, which is clearly out-of-date, but still a good read about the state of data integration eight years ago. “Data integration was hot in 2005, and the intense interest in this topic continues in 2006 as companies struggle to integrate their ever-growing mountain of data.
A TDWI study on data integration last November found that 69% of companies considered data integration issues to be a high or very high barrier to new application development. To solve this problem, companies are increasing their spending on data integration products.”
Business intelligence (BI) and data warehousing were the way to go at the time, and companies were spending millions to stand up these systems. Data integration was all massive data movements and manipulations, typically driven by tactical tools rather than true data integration solutions.
The issue I had at the time was the inability to deal with real-time operational data, and the cost of the technology and deployments. While these issues were never resolved with traditional BI and data warehousing technology, we now have access to databases that can manage over a petabyte of data, and the ability to cull through the data in seconds.
The ability to support massive amounts of data have reignited the interest in data integration. Up-to-the-minute operational data in these massive data stores is actually possible. We can now understand the state of the business as it happens, and thus make incremental adjustments based upon almost perfect information.
What this situation leads to is true value. We have delivery of the right information to the right people, at the right time, and the ability to place automated processes and polices around this data. Business becomes self-correcting and self-optimizing. The outcome is a business that is data-driven, and thus more responsive to the markets as well as to the business world itself.
However, big data is an impossible dream without a focus on how the data moves from place to place, using data integration best practices and technology. I guess we can call this big data integration, but it’s really the path to provide these massive data stores with the operational data required to determine the proper metrics for the business.
While data integration is not a new term. However the application of new ways to leverage and value data brings unprecedented new value to enterprises. Millions of dollars an hour of value are being delivered to Global 2000 organizations that leverage these emerging data integration approaches and technology. What’s more, data integration is moving from the tactical to the strategic budgets of IT.
So, what’s changed in eight years? We finally figured out how to get the value from our data, using big data and data integration. It took us long enough, but I’m glad it’s finally become a priority.
Looking for a data integration expert? Join the club. As cloud computing and big data become more desirable within the Global 2000, an abundance of data integration talent is required to make both cloud and big data work properly.
The fact of the matter is that you can’t deploy a cloud-based system without some sort of data integration as part of the solution. Either from on-premise to cloud, cloud-to-cloud, or even intra-company use of private clouds, these projects need someone who knows what they are doing when it comes to data integration.
While many cloud projects were launched without a clear understanding of the role of data integration, most people understand it now. As companies become more familiar with the could, they learn that data integration is key to the solution. For this reason, it’s important for teams to have at least some data integration talent.
The same goes for big data projects. Massive amounts of data need to be loaded into massive databases. You can’t do these projects using ad-hoc technologies anymore. The team needs someone with integration knowledge, including what technologies to bring to the project.
Generally speaking, big data systems are built around data integration solutions. Similar to cloud, the use of data integration architectural expertise should be a core part of the project. I see big data projects succeed and fail, and the biggest cause of failure is the lack of data integration expertise.
The demand for data integration talent has exploded with the growth of both big data and cloud computing. A week does not go by that I’m not asked for the names of people who have data integration, cloud computing and big data systems skills. I know several people who fit that bill, however they all have jobs and recently got raises.
The scary thing is, if these jobs go unfilled by qualified personnel, project directors may hire individuals without the proper skills and experience. Or worse, they may not hire anyone at all. If they plod along without the expertise required, in a year they’ll wonder why the systems are not sharing data the way they should, resulting in a big failure.
So, what can organizations do? You can find or build the talent you need before starting important projects. Thus, now is the time to begin the planning process, including how to find and hire the right resources. This might even mean internal training, hiring mentors or outside consultants, or working with data integration technology providers. Do everything necessary to make sure you get data integration done right the first time.
According to Health IT Portal, “Having an integrated health IT infrastructure allows a healthcare organization and its providers to streamline the flow of data from one department to the next. Not all health settings, however, find themselves in this situation. Either through business agreements or vendor selection processes, many a healthcare organization has to spend considerable time and resources getting their disparate health IT systems to talk to each.”
In other words, you can’t leverage Health Information Exchanges (HIEs) without a sound data integration strategy. This is something I’ve ranted about for years. The foundation of any entity-to-entity exchange, health, finance, or other, is that all relevant systems freely communicate, and thus able to consume and produce information that’s required by any information exchange.
The article cites the case of Memorial Healthcare, a community health care system in Owosso, MI. Memorial Healthcare has Meditech on the hospital side and Allscripts in its physician offices. Frank Fear, the CIO of Memorial Healthcare, spent the last few years working on solutions to enable data integration. The resulting solution between the two vendors’ offerings, as well as within the same system, is made up of both an EHR and a practice management solution.
Those in the world of healthcare are moving headlong into these exchanges. Most have no clue as to what must change within internal IT to get ahead of the need for the free flow of information. Moreover, there needs to be a good data governance strategy in place, as well as security, and a focus on compliance issues as well.
The reality is that, for the most part, data integration in the world of healthcare is largely ad-hoc, and tactical in nature. This has led to no standardized method for systems to talk one-to-another, and certainly no standard ways for data to flow out through exchanges. Think of plumbing that was built haphazardly and ad hoc over the years, with whatever was quick and easy. Now, you’ve finally turned on the water and there are many, many leaks.
In terms of data integration, healthcare has been underfunded for far too long. Now clear regulatory changes require better information management and security approaches. Unfortunately, healthcare IT is way behind, in terms of leveraging proper data integration approaches, as well as leveraging the right data integration technology.
As things change in the world of healthcare, including the move to HIEs, I suspect that data integration will finally get a hard look from those who manage IT in healthcare organizations. However, they need to do this with some sound planning, which should include an understanding of what the future holds in terms of information management, and how to create a common infrastructure that supports most of the existing and future use cases. Healthcare, you’re about 10 years behind, so let’s get moving this year.
Loraine Lawson does an outstanding job of covering the issues around government use of “data heavy” projects. This includes a report by the government IT site, MeriTalk.
“The report identifies five factors, which it calls the Big Five of IT, that will significantly affect the flow of data into and out of organizations: Big data, data center consolidation, mobility, security and cloud computing.”
MeriTalk surveyed 201 state and local government IT professionals, and found that, while the majority of organizations plan to deploy the Big Five, 94 percent of IT pros say their agency is not fully prepared. “In fact, if Big Data, mobile, cloud, security and data center consolidation all took place today, 89 percent say they’d need additional network capacity to maintain service levels. Sixty-three percent said they’d face network bottleneck risks, according to the report.”
This report states what most who work with the government already know; the government is not ready for the influx of data. Nor is the government ready for the different uses of data, and thus there is a large amount of risk as the amount of data under management within the government explodes.
Add issues with the approaches and technologies leveraged for data integration to the list. As cloud computing and mobile computing continue to rise in popularity, there is not a clear strategy and technology for syncing data in the cloud, or on mobile devices, with data that exists within government agencies. Consolidation won’t be possible without a sound data integration strategy, nor will the proper use of big data technology.
The government sees a huge wave of data heading for it, as well as opportunities with new technology such as big data, cloud, and mobile. However, there doesn’t seem to be an overall plan to surf this wave. According to the report, if they do wade into the big data wave, they are likely to face much larger risks.
The answer to this problem is really rather simple. As the government moves to take advantage of the rising tide of data, as well as new technologies, they need to be funded to get the infrastructure and the technology they need to be successful. The use of data integration approaches and technologies, for example, will return the investment ten-fold, if properly introduced into the government problem domains. This includes integration with big data systems, mobile devices, and, of course, the rising use of cloud-based platforms.
While data integration is not a magic bullet for the government, nor any other organization, the proper and planned use of this technology goes a long way toward reducing the inherent risks that the report identified. Lacking that plan, I don’t think the government will get very far, very fast.
Leo Eweani makes the case that the data tsunami is coming. “Businesses are scrambling to respond and spending accordingly. Demand for data analysts is up by 92%; 25% of IT budgets are spent on the data integration projects required to access the value locked up in this data “ore” – it certainly seems that enterprise is doing The Right Thing – but is it?”
Data is exploding within most enterprises. However, most enterprises have no clue how to manage this data effectively. While you would think that an investment in data integration would be an area of focus, many enterprises don’t have a great track record in making data integration work. “Scratch the surface, and it emerges that 83% of IT staff expect there to be no ROI at all on data integration projects and that they are notorious for being late, over-budget and incredibly risky.”
The core message from me is that enterprises need to ‘up their game’ when it comes to data integration. This recommendation is based upon the amount of data growth we’ve already experienced, and will experience in the near future. Indeed, a “data tsunami” is on the horizon, and most enterprises are ill prepared for it.
So, how do you get prepared? While many would say it’s all about buying anything and everything, when it comes to big data technology, the best approach is to splurge on planning. This means defining exactly what data assets are in place now, and will be in place in the future, and how they should or will be leveraged.
To face the forthcoming wave of data, certain planning aspects and questions about data integration rise to the top:
Performance, including data latency. Or, how quickly does the data need to flow from point or points A to point or points B? As the volume of data quickly rises, the data integration engines have got to keep up.
Data security and governance. Or, how will the data be protected both at-rest and in-flight, and how will the data be managed in terms of controls on use and change?
Abstraction, and removing data complexity. Or, how will the enterprise remap and re-purpose key enterprise data that may not currently exist in a well-defined and functional structure?
Integration with cloud-based data. Or, how will the enterprise link existing enterprise data assets with those that exist on remote cloud platforms?
While this may seem like a complex and risky process, think through the problems, leverage the right technology, and you can remove the risk and complexity. The enterprises that seem to fail at data integration do not follow that advice.
I suspect the explosion of data to be the biggest challenge enterprise IT will face in many years. While a few will take advantage of their data, most will struggle, at least initially. Which route will you take?
As covered by Loraine Lawson, “When it comes to data, the U.S. federal government is a bit of a glutton. Federal agencies manage on average 209 million records, or approximately 8.4 billion records for the entire federal government, according to Steve O’Keeffe, founder of the government IT network site, MeriTalk.”
Check out these stats, in a December 2013 MeriTalk survey of 100 federal records and information management professionals. Among the findings:
- Only 18 percent said their agency had made significant progress toward managing records and email in electronic format, and are ready to report.
- One in five federal records management professionals say they are “completely prepared” to handle the growing volume of government records.
- 92 percent say their agency “has a lot of work to do to meet the direction.”
- 46 percent say they do not believe or are unsure about whether the deadlines are realistic and obtainable.
- Three out of four say the Presidential Directive on Managing Government Records will enable “modern, high-quality records and information management.”
I’ve been working with the US government for years, and I can tell that these facts are pretty accurate. Indeed, the paper glut is killing productivity. Even the way they manage digital data needs a great deal of improvement.
The problem is that the issues are so massive that’s it’s difficult to get your arms around it. Just the DOD alone has hundreds of thousands of databases on-line, and most of them need to exchange data with other systems. Typically this is done using old fashion approaches, including “sneaker-net,” Federal Express, FTP, and creaky batching extracts and updates.
The “digital data diet,” as Loraine calls it, really needs to start with a core understanding of most of the data under management. That task alone will take years, but, at the same time, form an effective data integration strategy that considers the dozens of data integration strategies you likely formed in the past that did not work.
The path to better data management in the government is one where you have to map out a clear path from here to there. Moreover, you need to make sure you define some successes along the way. For example, the simple reduction of manual and paper processes by 5 or 10 percent would be a great start. It’s something that would save the tax payers billions in a short period of time.
Too many times the government gets too ambitious around data integration, and attempts to do too much in too short an amount of time. Repeat this pattern and you’ll find yourself running in quicksand, and really set yourself up for failure.
Data integration is game-changing technology. Indeed, the larger you are, the more game-changing it is. You can’t get much larger than the US government. Time to get to work.
Marketing is changing how we leverage data. In the past, we had rudimentary use of data to understand how marketing campaigns affect demand. Today, we focus on the customer. The shift is causing those in marketing to get good at data, and good at data integration. These data points are beginning to appear, as are the clear and well-defined links between data integration and marketing.
There is no better data point than Yesmail Interactive’s recent survey of 100 senior-level marketers at companies with online and offline sales models, and $10 million to more than $1 billion in revenues. My good friend, Loraine Lawson, outlined this report in a recent blog.
The resulting report, “Customer Lifecycle Engagement: Imperatives for mid-to-large companies,” (link requires sign up) shows many midsize and large B2C “marketers lack the data and technology they need for more effective segmentation.”
The report lists a few proof points:
- 86 percent of marketers say they could generate more revenue from customers if they had access to a more complete picture of customer attributes.
- 34 percent cited both poor data quality and fragmented systems as among the most significant barriers to personalized customer communications.
- On a similar note, only 46 percent were satisfied with data quality.
- 48 percent were satisfied with their web analytics integration.
- 47 percent were satisfied with their customer data integration.
- 41 percent of marketers incorporate web browsing and online behavior data in targeting criteria—although one-third said they plan to leverage this source in the future.
- Only 20 percent augment in-house customer data with third-party data at the customer level.
- Only 24 percent augment customer data at an aggregate level (such as the industry or region). Compare that to 58 percent who say they either purchase or plan to purchase third-party data to augment customer records, primarily to “validate data integrity.”
Considering this data, it’s pretty easy to draw the conclusions that those in marketing don’t have access to the customer data required to effectively do their jobs. Thus, those in enterprise IT who support marketing should take steps to leverage the right data integration processes and technologies to provide them access to the necessary analytical data.
The report includes a list of key recommendations, all of which center around four key strategic imperatives:
- Marketing data must shift from stagnant data silos to real-time data access.
- Marketing data must shift from campaign-centric to customer-centric.
- Marketing data must shift from non-integrated multichannel to integrated multichannel. Marketing must connect analytics, strategy and the creative.
If case you have not noticed, in order to carry out these recommendations, you need a sound focus on data integration, as well as higher-end analytical systems, which will typically leverage big data-types of technologies. For those in marketing, the effective use of customer and other data is key to understanding their marketplace, which is key to focusing marketing efforts and creating demand. The links with marketing and data integration are stronger than ever.