Hybrid is Better Than Pure Breed! And Data Must Be Managed in Hybrid Cloud!
What is it about us human beings? We have two beautiful breeds: a Poodle and a Labrador. Then, somebody feels the need to create a Labradoodle? (Though they are pretty cute!) Or we invent an electric car, but cannot let go of our reliance on existing fueling infrastructure, so we then invent a hybrid car. With the Labradoodle, clearly we think we are accomplishing new or better results with a hybrid. In the case of the hybrid car, we want to have our cake and eat it too — benefitting from new, innovative technology while keeping our investment in a known system.
In the case of Hybrid Cloud, well, it’s a little bit of both, isn’t it? We see the obvious benefits in a cloud architecture. We love the infinite and inexpensive capacity and the elasticity to quickly scale up and down to meet our business goals. We benefit from the agility to rapidly spin up new business capabilities in the cloud. We value the ability to spend only on what we need and when we need it, rather than make hefty upfront capital expenditures. Just as importantly, all this frees up our budgets and resources to invest in truly differentiating technical capabilities.
If we could just “forklift” our entire data center to the cloud, perhaps we would. But, alas, we cannot. Almost every company we speak with tells us that they are redirecting workloads to the cloud. However, they cannot do it all at once. And they cannot do it overnight. Certain on-premise investments remain for now.
As you pursue your own journey to the cloud, you will undoubtedly find yourself with a hybrid cloud architecture: some workloads running in public cloud, some in private cloud, and others remaining on-premise. In fact, according to a 2015 study by 451 Research¹, nearly half of respondents are using public cloud or will be within the next six months; two thirds are already using SaaS (Software as a Service) or will be within the next six months; and over a third are using PaaS (Platform as a Service) or will be within the next six months.
But what about managing your data in a hybrid cloud architecture? As you might imagine, at Informatica, we have a few thoughts on this dilemma. Integrating disparate data has never been easy, and the hybrid cloud paradigm has only exacerbated the problem. Data is everywhere and multiplying at staggering rates. With the movement to cloud, we are making a conscious decision to spread our data even further, across multiple cloud destinations. In fact, the nicely contained ‘on-prem’ universe we had before now seems like a simple data management picture compared to what’s ahead!
As you redirect workloads to public cloud and manage initiatives, such as hybrid data warehousing and hybrid application integration, there are four major types of data management challenges that will surface time and time again:
- Data Connectivity: In a hybrid cloud architecture, we need to integrate and manage data from an ever-growing number and variety of data systems, which may reside in public cloud, in private cloud or on premise. And we need to deliver data faster than ever before. This dictates the need for a data management architecture that provides out-of-the-box connectivity to any data source and target. Our need for speed requires native, high-performance connectivity that at the same time abstracts the native complexity from the developer. Because realistically, how many data systems can your average developer build native expertise in? Separating integration logic from underlying sources greatly improves productivity. It also allows developers to easily reuse integration logic, such as mappings, across data sources and targets, further increasing speed and productivity. Make sure your data management solution provides robust connectivity to any data system, ranging from cloud to mainframe and anything in between.
- Scalability: As data volumes grow more than ever before, a big advantage of moving to the cloud is the ability to infinitely scale your environment and deliver high performance at a fraction of the cost. However, doesn’t it defeat the purpose of architecting a massive-scale data warehouse if it takes an inordinate amount of time to extract and load the data from sources into it? This is akin to trying to fill an ocean through a straw. Similarly, moving vast amounts of data from on-premise systems to public cloud can be a time-consuming process if not done with the right tools. Therefore, in order to truly benefit from the inherent petabyte scalability found in public cloud data systems, your architecture should include a data integration platform that is also designed for infinite scalability and high performance. Your data integration platform should be inherently capable of moving large volumes of data at lightning speed. It should also be capable of dynamically scaling up and down with the changing needs of your environment.
- Data Visibility: As your data environment becomes more complex and intertwined, full visibility into data flows throughout your environment becomes more critical. This need for data comprehension is true for a wide range of stakeholders within your organization. Business stakeholders, such as data analysts, need to fully understand the origins of the data they base their analysis on. They need to know where the data came from, who touched the data in the course of its journey, and how it was transformed at any point along the way. All stakeholders, business and technical alike, need a common and consistent definition for business terms used in data management and analysis. For example, the term ‘revenue recognized’ should mean exactly the same thing to all the participants in a meeting to discuss end-of-quarter revenue. Developers must understand all the ways in which certain data is moved and transformed across the enterprise, so that when a code change is required they can analyze the risk of the change to the organization and determine which systems are impacted. This allows developers to plan and productively execute changes. The only way to provide data comprehension to all your stakeholders is via a metadata-driven data management platform. Metadata is the backbone of a reliable, self-documenting data management solution and the foundation of any governance initiative. In a complex hybrid cloud environment, chaos will reign without robust metadata management.
- Operational Control: In hybrid cloud, you will have multiple, complex business processes that span your environment and touch multiple data systems and applications in cloud and on premise. More than ever before, a central point of control becomes critically important to the success of your business. In order to ensure operational confidence in mission-critical data integration processes, you need the ability to orchestrate, administer and monitor your production data as it flows through your end-to-end environment. So, whether you are moving data from an on-premise system to public cloud, loading your hybrid data warehouse or integrating across your hybrid application ecosystem, the ability to manage all of this from a central point of control is key.
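The connectivity point above — integration logic that is decoupled from the native details of each source and target, so a mapping can be reused across systems — can be sketched in a few lines. This is a minimal illustration, not Informatica’s API; all class and function names here are invented for the example.

```python
# Illustrative sketch: separating integration logic from connectors.
# None of these names correspond to a real product API.
from abc import ABC, abstractmethod


class Connector(ABC):
    """Abstracts away the native details of one data system."""

    @abstractmethod
    def read(self):
        ...

    @abstractmethod
    def write(self, rows):
        ...


class InMemoryConnector(Connector):
    """Stand-in for any source or target (cloud DB, SaaS app, mainframe)."""

    def __init__(self, rows=None):
        self.rows = rows or []

    def read(self):
        return list(self.rows)

    def write(self, rows):
        self.rows.extend(rows)


def run_mapping(source, target, transform):
    """Reusable integration logic: read, transform, load.

    Because it depends only on the Connector interface, the same
    mapping runs unchanged against any source/target pair.
    """
    target.write([transform(row) for row in source.read()])


# The same mapping logic, reused against two different "systems".
crm = InMemoryConnector([{"name": "acme", "revenue": 100}])
warehouse = InMemoryConnector()
run_mapping(crm, warehouse, lambda r: {**r, "name": r["name"].upper()})
print(warehouse.rows)  # [{'name': 'ACME', 'revenue': 100}]
```

Swapping `InMemoryConnector` for a connector to a real system would leave `run_mapping` untouched — which is the productivity and reuse argument in a nutshell.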
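The scalability point — avoiding one serial pipe (the “straw”) by moving data in concurrent partitions — can be sketched as follows. This is a toy illustration under the assumption that the target can ingest independent partitions concurrently, as cloud warehouses typically can; the function names are invented for the example.

```python
# Illustrative sketch: partitioned, parallel loading instead of one
# serial pipe. Names are invented; real bulk-load APIs vary by system.
from concurrent.futures import ThreadPoolExecutor


def load_partition(partition):
    # In practice this would be a bulk-load call (e.g. staged file
    # ingest); summing stands in for per-partition work here.
    return sum(partition)


def parallel_load(rows, n_partitions=4):
    """Split the dataset and load partitions concurrently."""
    chunks = [rows[i::n_partitions] for i in range(n_partitions)]
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        return sum(pool.map(load_partition, chunks))


print(parallel_load(list(range(1000))))  # 499500
```

The design point is that throughput scales with the number of partitions, so the pipeline can be widened to match the elasticity of the cloud target rather than bottlenecking on a single stream.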
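The data visibility point — knowing where data came from and how it was transformed along the way — is the essence of metadata-driven lineage. A minimal sketch, with structures invented purely for illustration:

```python
# Illustrative sketch: each transformation step is recorded alongside
# the value, so the data's journey is self-documenting.
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    value: object
    history: list = field(default_factory=list)  # ordered step names

    def apply(self, step_name, fn):
        """Transform the value and record the step in its lineage."""
        self.value = fn(self.value)
        self.history.append(step_name)
        return self


rec = LineageRecord(" 1,200 ", history=["extracted:crm"])
rec.apply("trim", str.strip)
rec.apply("de-comma", lambda s: s.replace(",", ""))
rec.apply("to-int", int)
print(rec.value, rec.history)
# 1200 ['extracted:crm', 'trim', 'de-comma', 'to-int']
```

An analyst looking at the final value `1200` can trace exactly which system it was extracted from and every transformation applied in between — the per-record analogue of what a metadata-driven platform provides across the whole environment.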
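The operational control point — orchestrating the steps of a hybrid pipeline from one place and monitoring them through one status view — can be sketched as a toy orchestrator. All names are invented for the example:

```python
# Illustrative sketch: a central control point that runs pipeline
# steps in order and keeps a single status log for operators.
class Orchestrator:
    def __init__(self):
        self.status = {}  # the central monitoring view

    def run(self, steps):
        """steps: ordered list of (name, callable) pairs."""
        for name, task in steps:
            try:
                task()
                self.status[name] = "success"
            except Exception as exc:
                self.status[name] = f"failed: {exc}"
                break  # halt the flow on failure
        return self.status


orch = Orchestrator()
print(orch.run([
    ("extract-on-prem", lambda: None),   # e.g. pull from an on-premise system
    ("load-cloud-dw", lambda: None),     # e.g. load the hybrid data warehouse
]))
# {'extract-on-prem': 'success', 'load-cloud-dw': 'success'}
```

Whether the steps touch on-premise systems, private cloud or public cloud, the operator checks one `status` view — the single point of control the bullet above argues for.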
Earlier in this blog post, we mentioned data from 451 Research. Carl Lehman of 451 Research has done extensive research on this topic, and he will be joining Informatica in an upcoming webinar to discuss the inherent data management challenges in hybrid cloud and best practices for addressing them.
If you are interested in learning more, please join us for this webinar with 451 Research and Informatica: Data Management in the Hybrid Cloud Era.
¹ 451 Research, “Voice of the Enterprise: Cloud Computing, Worldwide & Regional Survey Results and Narratives,” Q3 2015