How to Manage the Complexity of Cloud Analytics
Welcome back to our series of Analytics Chalk Talk videos. I hope you enjoyed our last video: “What’s Next for Big Data Analytics?” This week, I want to discuss how to support the popularity of cloud analytics without creating yet another data silo. Below is a transcript from the video as well as the video itself. Watch and please share your views. Find me on Twitter @leanlyle.
One of the big questions I get asked a lot is “How do I use cloud analytics and avoid the chaos caused by all these different silos?”
Well, the answer is, “very carefully.”
The big cloud concerns
There are a lot of advantages to using the cloud for analytics. But because of the speed, agility, and ease with which new technologies can be set up, there’s also a lot of chaos.
Let’s start with our typical environment.
Here’s what we’ve been used to in the world of data warehousing and analytics: I’ve got my data sources, and I go through my data integration and data quality steps (or however you’ve architected this) into my on-premise data warehouse. Now, this is a radically oversimplified picture of course, and you’ve done this in a way that’s far more detailed and appropriate. The point is, as we move into the cloud, the important thing to realize is that data silos have been sprouting up over the years anyway, right? Even before the cloud came along, we had silos.
The only difference is that the cloud increases the speed and ease with which the business or any other team can create a new silo.
In fact, what I would recommend is, grab your credit card and go to Amazon Web Services, or Azure, or the Google Cloud Platform, and spin up your own Linux instance. Fire up your own database, and see how easy it is for you to do that. What you’ll see is that these new technologies really lower the barriers to creating a new silo, and raise the speed at which somebody can do it.
So, we have to get out in front of this.
What to do about it
The first thing you really need to do is get experience with some of the cloud integration services that are out there, specifically Integration Platform as a Service, or iPaaS.
Now you’ve got a new user interface that has been rethought, radically simplified, and put out on the cloud. So you and your users can use wizards to create, clean, and connect data and applications much faster than you could when you were using on-premise enterprise software.
This is important because these are the kinds of things your line-of-business colleagues and business or data analysts have available to them as well. So you need to know these tools to understand how fast those teams can work. The good news is that these are also useful tools for you.
In fact, iPaaS will likely replace some of the things that have been done with on-premise tools in the past. Additionally, what this allows you to do is to look at a range of new analytics services like Amazon Redshift, different column store databases, different NoSQL systems, and so on.
So what we have is a range of different tools in our toolkit. But what we really need is a way to feed all of them with the right data—to be able to manage it and control it. So, you really see what’s become a triangle, an interchange of data between these different systems.
Visibility is key
The most important thing is to make sure you can see what’s going on. You want constant transparency into all your metadata: who’s doing what, when they’re doing it, what they did, how long it took them, and whether anything went wrong. You want that full audit trail of what’s going on.
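To make that audit trail a little more concrete, here is a minimal Python sketch of what one audit record could capture. It is only an illustration, not how any particular iPaaS product works: the names `AuditRecord`, `AUDIT_LOG`, and `audited` are hypothetical, and a real service would write to a managed metadata store rather than an in-memory list.

```python
from __future__ import annotations

import time
from contextlib import contextmanager
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    """One audit-trail entry (hypothetical schema, for illustration only)."""
    user: str                  # who did it
    action: str                # what they did
    started_at: datetime       # when they did it (UTC)
    duration_s: float = 0.0    # how long it took
    error: str | None = None   # what went wrong, if anything


# Stand-in for a real metadata store; an assumption for this sketch.
AUDIT_LOG: list[AuditRecord] = []


@contextmanager
def audited(user: str, action: str):
    """Record who, what, when, how long, and any failure for the wrapped work."""
    record = AuditRecord(user=user, action=action,
                         started_at=datetime.now(timezone.utc))
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Note the failure, then let it propagate to the caller.
        record.error = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        # Runs on success and on failure, so every attempt is logged.
        record.duration_s = time.perf_counter() - start
        AUDIT_LOG.append(record)


# Usage: wrap any integration step so that it leaves a trace.
with audited("analyst_a", "load sales extract"):
    pass  # the actual integration work would go here
```

The point of the `finally` clause is that failed jobs are logged as well as successful ones, so the "did anything go wrong" question can always be answered from the trail.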
So what you’re working toward is getting out ahead of the business. You’re giving them almost a provisioned self-service capability so they can do what they need to do, but at the same time, you’re in a position to manage that service. You need to be able to see what they’re using and how they’re using it so you can help them work in faster and smarter ways. So iPaaS is about making sure you now have better oversight over what they’re doing than you used to.
So coming back to our initial question: how do you take advantage of cloud services without losing control and creating more chaos? You do it by looking at iPaaS and cloud integration services, and provisioning that out to the business.