Informatica recently released the findings of a survey (titled “Data is Holding You Back from Analytics Success”) in which 85% of respondents revealed that they are effective at putting financial data to use to inform decision making. However, the survey also discovered that many are less confident about putting data to use to inform patient engagement initiatives that require access to external data and big data, which respondents note to be more challenging.
The idea is that data unto itself does not carry that much value. For example, I’ve been gathering data with my Fitbit for over 90 days. A use of that data could be to look at patterns that might indicate I’m more likely to have a heart attack. However, this can only be determined if we compare my data with external historical patient data that exists in a large analytical database (big data).
The external data provides the known patterns that lead to known outcomes. Thus, when compared with my data, predictive analytics can occur. In other words, we can use data integration as a way to mash up and analyze the data so it has more meaning and value. In this case, perhaps having me avoid a future heart attack.
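To make the comparison concrete, here is a minimal sketch of that kind of analysis, assuming a single metric (resting heart rate) and an entirely made-up threshold standing in for patterns mined from external patient data:

```python
from statistics import mean

# Assumption: a risk threshold of 85 bpm, standing in for patterns mined
# from external historical patient data. The number is illustrative only.
RISKY_RESTING_HR = 85

def cardiac_risk_flag(daily_resting_hr):
    """Flag when 90 days of wearable data exceed the historical threshold."""
    return mean(daily_resting_hr) > RISKY_RESTING_HR

# 90 days of made-up resting heart rates from a fitness tracker
readings = [72, 75, 78, 74] * 22 + [76, 77]
print(cardiac_risk_flag(readings))  # False: the average sits well below 85
```

Real predictive models compare many signals against many historical outcomes, but the principle is the same: personal data only becomes predictive when matched against external reference data.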
Inter-organizational transformational business processes require information sharing between data sources, and yet, according to the Informatica survey, over 65% of respondents say data integration and data quality are significantly challenging. Thus, healthcare providers collect data, but many have yet to integrate these data silos to realize their full potential.
Indeed, the International Institute of Analytics offered a view of healthcare analytics maturity by looking at more than 20 healthcare provider organizations. The study validated the fact that, while healthcare providers are indeed gathering the EMR data, they are not acting upon the data in meaningful ways.
The core problem is a lack of understanding of the value that this data can bring. Or, perhaps the lack of a budget for the right technology. Much as my Fitbit could help me prevent a future heart attack by tracking my activity data, healthcare providers can use their data to become more proactive around health issues.
Better utilization of this data will reduce costs by leveraging predictive analytics to take more preventative measures. For instance, automatically culling through the family tree of a patient to determine risks for cancer, heart disease, etc., and automatically scheduling specific kinds of tests that are not normally given unless the patient is symptomatic.
Of course, putting your data to work is not free. It’s going to take some level of effort to create strategies, and acquire and deploy data integration technology. However, the benefits are easy to define, thus the business case is easy to create as well.
For myself, I’ll keep gathering data. Hopefully it will have some use, someday.
For example, drones fly over a cornfield to gather data that will determine the effectiveness of irrigation. This process collects gigabytes of data that can be analyzed to determine where the farmer needs to address issues that reduce the yield of the field. In another example, an MRI gathers massive amounts of data in a single scan that are analyzed along with past diagnoses and outcome data to determine what’s going on now, as well as what will likely go on in the future, based upon patterns that it sees in the scan. In yet another example, a jet engine produces data during a flight that, when analyzed using predictive analytics, lets the pilots know that it’s about 5 hours away from a complete failure.
The use cases for IoT are expansive. They are all data driven, and just like applications that produce data, care must be given as to how the data is integrated with other systems. Thus, the two most important concepts of IoT include data integration and data analytics.
These days, anyone who builds IoT systems needs to understand a few new requirements for IoT data integration technology:
First, the volume of data will increase significantly, and the data will be transmitted as near-real-time or real-time streams. Older message-based data integration approaches may not scale well under these conditions.
Second, at the same time, data quality must be checked at rest and in flight, and the data must be placed in a data store where it can be analyzed, typically immediately. Bad data ruins the value of IoT. Considering that devices produce all types of data in all types of unstructured states, the ability to apply policies that perform data quality checks and data governance is imperative.
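As a rough sketch of what an in-flight quality policy might look like (field names and ranges are hypothetical, not from any specific product):

```python
def quality_check(record, policies):
    """Return True only if every policy (field, predicate) passes."""
    return all(field in record and predicate(record[field])
               for field, predicate in policies)

# Hypothetical policies: every record needs a device identity and a
# physically plausible temperature.
policies = [
    ("device_id", lambda v: bool(v)),
    ("temperature_c", lambda v: -50 <= v <= 150),
]

stream = [
    {"device_id": "sensor-7", "temperature_c": 21.5},
    {"device_id": "", "temperature_c": 19.0},           # no identity
    {"device_id": "sensor-9", "temperature_c": 999.0},  # out of range
]

clean = [r for r in stream if quality_check(r, policies)]
print(len(clean))  # 1: only the first record survives
```

The point is that the checks run as the data streams past, before bad records ever reach the analytical store.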
Finally, in many instances, data integration approaches and technology will have to combine data on the fly. For instance, the ability to assign rankings for level of irrigation out of an existing database, using data gathered in real time from the drones flying overhead.
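A minimal sketch of that kind of on-the-fly combination, with made-up zones and rankings:

```python
# Hypothetical reference data: irrigation rankings held in an existing database.
irrigation_rankings = {"zone-a": "adequate", "zone-b": "under-irrigated"}

def enrich(reading, rankings):
    """Combine a real-time drone reading with reference data on the fly."""
    return {**reading, "ranking": rankings.get(reading["zone"], "unknown")}

drone_stream = [
    {"zone": "zone-a", "soil_moisture": 0.31},
    {"zone": "zone-b", "soil_moisture": 0.12},
    {"zone": "zone-c", "soil_moisture": 0.25},  # zone not yet in the database
]

enriched = [enrich(r, irrigation_rankings) for r in drone_stream]
print([r["ranking"] for r in enriched])
```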
IoT meshes nicely with both cloud and big data. Indeed, most IoT applications and data will find that they are more cost effective when hosted in the cloud. Real time data analytics that allow us to gather value from IoT systems come directly from emerging big data technology.
IoT is changing the game as to how we gather and deal with real-time data. It’s changing our lives, in terms of having a better relationship with technology, and finally gives us the data to proactively solve problems.
Key to making IoT work is having a sound data integration strategy and technology implementation. If IoT is in your future, now is the time to figure this out.
In case you haven’t noticed, data integration is all the rage right now. Why? There are three major reasons for this trend that we’ll explore below. Notably, a recent USA Today story described corporate data as a much more valuable asset than it was just a few years ago. Moreover, the sheer volume of data is exploding.
For instance, in a report published by research company IDC, the firm estimated that the total amount of data created or replicated worldwide in 2012 would add up to 2.8 zettabytes (ZB). By 2020, IDC expects the annual data-creation total to reach 40 ZB, roughly a 50-fold increase from where things stood at the start of 2010.
But the growth of data is only a part of the story. Indeed, I see three things happening that drive interest in data integration.
First, the growth of cloud computing. The rise of data integration alongside cloud computing is logical, considering that we’re relocating data to public clouds, and that data must be synced with systems that remain on-premises.
The data integration providers, such as Informatica, have stepped up. They provide data integration technology that can span enterprises, managed service providers, and clouds, while dealing with the special needs of cloud-based systems. Moreover, at the same time, data integration improves the way we do data governance and data quality.
Second, the growth of big data. A recent IDC forecast shows that the big data technology and services market will grow at a 26.4% compound annual growth rate to $41.5 billion through 2018, or, about six times the growth rate of the overall information technology market. Additionally, by 2020, IDC believes that line of business buyers will help drive analytics beyond its historical sweet spot of relational to the double-digit growth rates of real-time intelligence and exploration/discovery of the unstructured worlds.
The world of big data revolves around data integration. The more that enterprises rely on big data, and the more that data needs to move from place to place, the more a core data integration strategy and technology is needed. That means you can’t talk about big data without talking about big data integration.
Data integration technology providers have responded with technology that keeps up with the volume of data that moves from place to place. As linked to the growth of cloud computing above, providers also create technology with the understanding that data now moves within enterprises, between enterprises and clouds, and even from cloud to cloud. Finally, data integration providers know how to deal with both structured and unstructured data these days.
Third, better understanding around the value of information. Enterprise managers always knew their data was valuable, but perhaps they did not understand the true value that it can bring.
With the growth of big data, we now have access to information that helps us drive our business in the right directions. Predictive analytics, for instance, allows us to take years of historical data and determine patterns that allow us to predict the future. Mashing up our business data with external data sources makes our data even more valuable.
Of course, data integration drives much of this growth. Thus the refocus on data integration approaches and tech. There are years and years of evolution still ahead of us, and much to be learned from the data we maintain.
As reported by the Economic Times, “In the coming years, enormous volumes of machine-generated data from the Internet of Things (IoT) will emerge. If exploited properly, this data – often dubbed machine or sensor data, and often seen as the next evolution in Big Data – can fuel a wide range of data-driven business process improvements across numerous industries.”
We can all see this happening in our personal lives. Our thermostats are connected now, our cars have been for years, even my toothbrush has a Bluetooth connection with my phone. On the industrial side, devices have also been connected for years, tossing off megabytes of data per day that have been typically used for monitoring, with the data tossed away as quickly as it appears.
So, what changed? With the advent of big data, cheap cloud, and on-premise storage, we now have the ability to store machine or sensor data spinning out of industrial machines, airliners, health diagnostic devices, etc., and leverage that data for new and valuable uses.
For example, the ability to determine the likelihood that a jet engine will fail, based upon the sensor data gathered and how that data compares with existing known patterns of failure. Instead of getting an engine failure light on the flight deck, the pilots can see that the engine has a 20 percent likelihood of failure, and get the engine serviced before it fails completely.
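A toy illustration of the idea, using an entirely made-up similarity score between recent sensor readings and a known failure signature (real predictive models are far more sophisticated):

```python
def failure_likelihood(recent, failure_signature):
    """Made-up 0-1 score: how closely recent readings match a failure pattern."""
    diffs = [abs(a - b) / max(abs(b), 1e-9)
             for a, b in zip(recent, failure_signature)]
    return round(max(0.0, 1.0 - sum(diffs) / len(diffs)), 2)

# Vibration readings drifting toward a known pre-failure profile (fabricated data)
print(failure_likelihood([4.1, 4.3, 4.6], [5.0, 5.0, 5.0]))  # 0.87
```

The integration work is what makes this possible at all: the live sensor stream and the historical failure patterns live in different systems and must be brought together before any scoring can happen.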
The problem with all of this very cool stuff is that we need to once again rethink data integration. Indeed, if the data can’t get from the machine sensors to a persistent data store for analysis, then none of this has a chance of working.
That’s why those who are moving to IoT-based systems need to do two things. First, they must create a strategy for extracting data from devices, such as industrial robots or an Audi A8. Second, they need a strategy to take all of this disparate data that’s firing out of devices at megabytes per second, and put it where it needs to go, and in the right native structure (or in an unstructured data lake), so it can be leveraged in useful ways, and in real time.
The challenge is that machines and devices are not traditional IT systems. I’ve built connectors for industrial applications in my career. The fact is, you need to adapt to the way that the machines and devices produce data, and not the other way around. Data integration technology needs to adapt as well, making sure that it can deal with streaming and unstructured data, including many instances where the data needs to be processed in flight as it moves from the device to the database.
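A small sketch of that kind of in-flight normalization, assuming two hypothetical device formats (one JSON-speaking, one legacy pipe-delimited):

```python
import json

def parse_device_message(raw):
    """Normalize heterogeneous device output into one record shape.

    The two formats here (JSON and pipe-delimited) are hypothetical stand-ins
    for whatever the machines actually emit.
    """
    if raw.startswith("{"):                 # JSON-speaking device
        payload = json.loads(raw)
        return {"device": payload["id"], "value": float(payload["val"])}
    device, value = raw.split("|")          # legacy pipe-delimited device
    return {"device": device, "value": float(value)}

messages = ['{"id": "pump-1", "val": "3.2"}', "press-4|7.8"]
records = [parse_device_message(m) for m in messages]
print(records)
```

The adapter bends to the device, not the other way around; the downstream database only ever sees one consistent record shape.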
This becomes a huge opportunity for data integration providers who understand the special needs of IoT, as well as the technology that those who build IoT-based systems can leverage. However, the larger value is for those businesses that learn how to leverage IoT to provide better services to their customers by offering insights that have previously been impossible. Be it jet engine reliability, the fuel efficiency of my car, or feedback to my physician from sensors on my body, this is game changing stuff. At the heart of its ability to succeed is the ability to move data from place-to-place.
As reviewed by Loraine Lawson, a MeriTalk survey about cloud adoption found that “In the latest survey of 150 federal executives, nearly one in five say one-quarter of their IT services are fully or partially delivered via the cloud.”
For the most part, the shifts are more tactical in nature. These federal managers are shifting email (50 percent), web hosting (45 percent) and servers/storage (43 percent). Most interesting is that they’re not moving traditional business applications, custom business apps, or middleware. Why? Data, and data integration issues.
“Federal agencies are worried about what happens to data in the cloud, assuming they can get it there in the first place:
- 58 percent of executives fret about cloud-to-legacy system integration as a barrier.
- 57 percent are worried about migration challenges, suggesting they’re not sure the data can be moved at all.
- 54 percent are concerned about data portability once the data is in the cloud.
- 53 percent are worried about ‘contract lock-in.’ ”
The reality is that the government does not get much out of the movement to cloud without committing core business applications, and thus core data. While e-mail, Web hosting, and some storage are good, the real cloud computing money is made when moving away from expensive hardware and software. Failing to do that, you fail to find the value and, in this case, spend more taxpayer dollars than you should.
Data issues are not just a concern in the government. Most larger enterprises have the same issues as well. However, a few are able to get around these issues with good planning approaches and the right data management and data integration technology. It’s just a matter of making the initial leap, which most Federal IT executives are unwilling to do.
In working with CIOs of Federal agencies in the last few years, the larger issue is that of funding. While everyone understands that moving to cloud-based systems will save money, getting there means hiring government integrators and living with redundant systems for a time. That involves some major money. If most of the existing budget goes to existing IT operations, then the move may not be practical. Thus, there should be funds made available to work on the cloud projects with the greatest potential to reduce spending and increase efficiencies.
The shame of this situation is that the government was pretty much on the leading edge of cloud computing back in 2008 and 2009. The CIO of the US Government, Vivek Kundra, promoted the use of cloud computing, and NIST drove the initial definitions of “The Cloud,” including IaaS, SaaS, and PaaS. But when it came down to making the leap, most agencies balked at the opportunity, citing issues with data.
Now that the technology has evolved even more, there is really no excuse for the government to delay migration to cloud-based platforms. The clouds are ready, and the data integration tools have cloud integration capabilities baked in. It’s time to see some more progress.
Back in 2004, we saw the rapid growth of SaaS providers such as Salesforce.com. However, there was typically no consistent data integration strategy to go along with the use of SaaS. In many instances, SaaS-delivered applications became the new data silos in the enterprise, silos that lacked a sound integration plan and integration technology.
Ten years later, we’ve gotten to a point where we have the ability to solve problems using SaaS, as well as the data integration problems around the use of SaaS. However, we typically lack the knowledge and understanding of how to effectively use data integration technology within an enterprise to integrate SaaS problem domains.
Lawson looks at both sides of the SaaS integration argument. “Surveys certainly show that integration is less of a concern for SaaS than in the early days, when nearly 88 percent of SaaS companies said integration concerns would slow down adoption and more than 88 percent said it’s an important or extremely important factor in winning new customers.”
Again, while we’ve certainly gotten better at integration, we’re nowhere near being out of the woods. “A Dimensional Research survey of 350 IT executives showed that 67 percent cited data integration problems as a challenge with SaaS business applications. And as with traditional systems, integration can add hidden costs to your project if you ignore it.”
As I’ve stated many times in this blog, integration requires a bit of planning and the use of solid technology. While this does require some extra effort and money, the return on the value of this work is huge.
SaaS integration requires that you take a bit of a different approach than traditional enterprise integration. SaaS systems typically place your data behind well-defined APIs that can be accessed directly or through a data integration technology. While the information can be consumed by anything that can invoke an API, enterprises still have to deal with structure and content differences, and that’s typically best handled using the right data integration technology.
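As a sketch, mapping a record pulled from a hypothetical SaaS API into an internal schema might look like this (the field names are illustrative, not from any specific SaaS product):

```python
def to_internal(saas_record, field_map):
    """Map a SaaS API record to the internal schema, dropping everything else."""
    return {internal: saas_record.get(external)
            for external, internal in field_map.items()}

# Hypothetical mapping from SaaS field names to internal ones
field_map = {"AccountName": "customer_name", "AnnualRevenue": "revenue"}
saas_record = {"AccountName": "Acme Corp", "AnnualRevenue": 1200000, "Extra": "x"}
print(to_internal(saas_record, field_map))
# {'customer_name': 'Acme Corp', 'revenue': 1200000}
```

In practice a data integration product handles this declaratively, but the structural mismatch it resolves is exactly this one: the SaaS provider’s schema versus yours.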
Other things to consider, things that are again often overlooked, is the need for both data governance and data security around your SaaS integration solution. There should be a centralized control mechanism to support the proper management and security of the data, as well as a mechanism to deal with data quality issues that often emerge when consuming data from any cloud computing services.
The reality is that SaaS is here to stay. Even enterprise software players that put off the move to SaaS-delivered systems are now standing up SaaS offerings. The economics around the use of SaaS are just way too compelling. However, as SaaS-delivered systems become more commonplace, so will the emergence of new silos. This will not be an issue if you leverage the right SaaS integration approach and technology. What will your approach be?
It’s true. Data integration is a whole new game, compared to five years ago, or, in some organizations, five minutes ago. The right approaches to data integration continue to evolve around a few principal forces: First, the growth of cloud computing, as pointed out by Stafford. Second, the growing use of big data systems, and the emerging use of data as a strategic asset for the business.
These forces combine to drive us to the understanding that old approaches to data integration won’t provide the value that they once did. As someone who was a CTO of three different data integration companies, I’ve seen these patterns change over the time that I was building technology, and that change has accelerated in the last 7 years.
The core opportunities lie with the enterprise architect, and their ability to drive an understanding of the value of data integration, as well as drive change within their organization. After all, they, or the enterprise’s CTOs and CIOs (whoever makes decisions about technological approaches), are supposed to drive the organization in the right technical directions that will provide the best support for the business. While most enterprise architects follow the latest hype, such as cloud computing and big data, many have missed the underlying data integration strategies and technologies that will support these changes.
“The integration challenges of cloud adoption alone give architects and developers a once in a lifetime opportunity to retool their skillsets for a long-term, successful career, according to both analysts. With the right skills, they’ll be valued leaders as businesses transition from traditional application architectures, deployment methodologies and sourcing arrangements.”
The problem is that, while most agree that data integration is important, they typically don’t understand what it is, and the value it can bring. These days, many developers live in a world of instant updates. With emerging DevOps approaches and infrastructure, they really don’t get the need, or the mechanisms, required to share data between application or database silos. In many instances, they resort to coding interfaces between source and target systems. This leads to brittle and unreliable integration solutions, and thus hurts and does not help new cloud application and big data deployments.
The message is clear: Those charged with defining technology strategies within enterprises need to also focus on data integration approaches, methods, patterns, and technologies. Failing to do so means that the investments made in new and emerging technology, such as cloud computing and big data, will fail to provide the anticipated value. At the same time, enterprise architects need to be empowered to make such changes. Most enterprises are behind on this effort. Now it’s time to get to work.
The article cites some research from Ovum, which predicts many enterprises will begin moving toward data integration, driven largely by the rise of cloud computing and big data. However, enterprises need to invest in both modernizing the existing data management infrastructure, as well as in data integration technology. “All of these new investments will push the middleware software market up 9 percent to a $16.3 billion industry, Information Management reports.” This projection is for 2015.
I suspect that’s a bit conservative. In my travels, I see much more interest in data integration strategies, approaches, and technology, as cloud computing continues to grow, as well as enterprises understand better the strategic use of data. So, I would put the growth at 15 percent for 2015.
There are many factors driving this growth, beyond mere interest in cloud computing and big data.
The first consideration is that data is more strategic than initially understood. While businesses have always considered data a huge asset, it has not been until the last few years that businesses have seen the true value of understanding what’s going on inside, and outside of their business.
Manufacturing companies want to see the current state of production, as well as production history. Management can now use that data to predict trends to address, such as future issues around employee productivity, or even a piece of equipment that is likely to fail and the impact of that failure on revenue. Healthcare companies are learning how to better monitor patient health, such as spotting likely health problems before they are diagnosed, or leveraging large data to understand when patterns emerge around health issues, such as areas of the country that are more prone to asthma, based upon air quality.
Second, there is the need to deal with compliance issues. The new health care regulations, and even the new regulations around managing a publicly traded company, involve a great deal of data management work, including data integration.
As these laws emerge, and are altered over time, the reporting requirements are always more complex and far reaching than they were before. Those who want to avoid fines, or even avoid stock drops around mistakes, are paying close attention to this area.
Finally, there is an expectation from customers and employees that you will have a good handle on your data. Ten years ago, you could tell a customer on the phone that you needed to check different systems to answer their question. Those days are over. Today’s customers and employees want immediate access to the data they need, and there is no good excuse for not being able to produce that data. If you can’t, your competition will.
The interest in data integration will experience solid growth in 2015, around cloud and big data, for sure. However, other factors will drive this growth, and enterprises will finally understand that data integration is core to an IT strategy, and should never be an afterthought.
According to the article, in Hamilton County Ohio, it’s not unusual to see kids from the same neighborhoods coming to the hospital for asthma attacks. Thus, researchers wanted to know if it was fact or mistaken perception that an unusually high number of children in the same neighborhood were experiencing asthma attacks. The next step was to review existing data to determine the extent of the issues, and perhaps how to solve the problem altogether.
“The researchers studied 4,355 children between the ages of 1 and 16 who visited the emergency department or were hospitalized for asthma at Cincinnati Children’s between January 2009 and December 2012. They tracked those kids for 12 months to see if they returned to the ED or were readmitted for asthma.”
Not only were the researchers able to determine a sound correlation between the two data sets, but they were also able to advance the research to predict which kids were at high risk based upon where they live. Thus, some of the causes and effects have been determined.
This came about when researchers began thinking out of the box, when it comes to dealing with traditional and non-traditional medical data. They integrated housing and census data, in this case, with that of the data from the diagnosis and treatment of the patients. These are data sets unlikely to find their way to each other, but together they have a meaning that is much more valuable than if they just stayed in their respective silos.
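A toy sketch of that kind of cross-silo join, with entirely fabricated neighborhood and visit records:

```python
# Fabricated neighborhood-level housing data (one silo)...
housing = {
    "avondale": {"substandard_housing_pct": 34},
    "hyde-park": {"substandard_housing_pct": 4},
}

# ...and fabricated patient visit records (the other silo)
visits = [
    {"patient": 1, "neighborhood": "avondale", "readmitted": True},
    {"patient": 2, "neighborhood": "avondale", "readmitted": True},
    {"patient": 3, "neighborhood": "hyde-park", "readmitted": False},
]

# Join on neighborhood, then look at readmissions where housing quality is poor
poor = [v for v in visits
        if housing[v["neighborhood"]]["substandard_housing_pct"] > 20]
rate = sum(v["readmitted"] for v in poor) / len(poor)
print(rate)  # 1.0
```

Neither data set reveals the pattern on its own; the insight only exists once the two silos are joined on a shared key.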
“Non-traditional medical data integration has begun to take place in some medical collaborative environments already. The New York-Presbyterian Regional Health Collaborative created a medical village, which ‘goes beyond the established patient-centered medical home mode.’ It not only connects an academic medical center with a large ambulatory network, medical homes, and other providers with each other, but community resources such as school-based clinics and specialty-care centers (the ones that are a part of NYP’s network).”
The fact of the matter is that data is the key to understanding what the heck is going on when clusters of sick people begin to emerge. While researchers and doctors can treat the individual patients, there is not a good understanding of the larger issues that may be at play, in this case, poor air quality in poor neighborhoods. Once that is known, they understand what problem needs to be corrected.
The universal sharing of data is really the larger solution here, but one that won’t be approached without a common understanding of the value, and funding. As we pass laws around the administration of health care, as well as how data is to be handled, perhaps it’s time we look at what the data actually means. This requires a massive deployment of data integration technology, and the fundamental push to share data with a central data repository, as well as with health care providers.
California reported a total of 167 data breaches in 2013, which is up 28 percent from 2012. Two major data breaches caused most of this uptick, including the Target attack that was reported in December 2013, and the LivingSocial attack that occurred in April 2013. This year, you can add the Home Depot data breach to that list, as well as the recent breach at the US Post Office.
So, what the heck is going on? And how does this news impact data integration? Should we be concerned as we place more and more data on public clouds, or within big data systems?
Almost all of these breaches were made possible by traditional systems with security technology and security operations that fell far enough behind that outside attackers found a way in. You can count on many more of these attacks, as enterprises and governments don’t look at security as what it is: an ongoing activity that may require massive and systemic changes to make sure the data is properly protected.
As enterprises and government agencies stand up cloud-based systems, and new big data systems, either inside (private) or outside (public) of the enterprise, there are some emerging best practices around security that those who deploy data integration should understand. Here are a few that should be on the top of your list:
First, start with Identity and Access Management (IAM) and work your way backward. These days, most cloud and non-cloud systems are complex distributed systems. That means IAM is clearly the best security model and best practice to follow with the emerging use of cloud computing.
The concept is simple: provide a security approach and technology that enables the right individuals to access the right resources, at the right times, for the right reasons. The concept follows the principle that everything and everyone gets an identity. This includes humans, servers, APIs, applications, data, etc. Once identities are verified, it’s just a matter of defining which identities can access other identities, and creating policies that define the limits of that relationship.
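A minimal sketch of identity-to-identity policy checks (the identities and policy shape are illustrative, not a specific IAM product):

```python
# Illustrative identity-to-identity policies; every subject and resource
# is an identity, and a policy defines the limits of their relationship.
policies = {
    ("analytics-app", "patient-db"): {"read"},
    ("etl-job", "patient-db"): {"read", "write"},
}

def allowed(subject, resource, action):
    """True if the subject identity may perform the action on the resource."""
    return action in policies.get((subject, resource), set())

print(allowed("analytics-app", "patient-db", "read"))   # True
print(allowed("analytics-app", "patient-db", "write"))  # False
```

Note the default: an identity pair with no policy gets nothing, which is the deny-by-default posture IAM systems are built around.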
Second, work with your data integration provider to identify solutions that work best with their technology. Most data integration solutions address security in one way, shape, or form. Understanding those solutions is important to secure data at rest and in flight.
Finally, splurge on monitoring and governance. Many of the issues around this growing number of breaches stem from system managers’ inability to spot and stop attacks. Creative approaches to monitoring system and network utilization, as well as data access, will allow those in IT to spot most of the attacks and correct the issues before they ‘go nuclear.’ Typically, an increasing number of breach attempts leads up to the complete breach.
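One crude way to sketch “spotting the attempts before the complete breach” is a baseline comparison on failed-login counts (the window and threshold are illustrative; real monitoring uses far richer signals):

```python
def rising_attempts(daily_failed_logins, window=3, factor=2.0):
    """Flag when recent failed-login volume jumps versus the prior baseline.

    Compares the mean of the last `window` days against the mean of
    everything before it; thresholds are illustrative only.
    """
    recent = daily_failed_logins[-window:]
    baseline = daily_failed_logins[:-window] or [1]
    return (sum(recent) / len(recent)) > factor * (sum(baseline) / len(baseline))

print(rising_attempts([10, 12, 9, 11, 10, 30, 42, 55]))  # True
```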
The issue and burden of security won’t go away. Systems will continue to move to public and private clouds, and data will continue to migrate to distributed big data types of environments. And that means the need for data integration and data security will continue to explode.