Metadata is Your Most Valuable Asset in a Multi-cloud World
It feels a bit obvious, at this point, to refer to data as a company’s most valuable asset. If there are two truths in today’s digital world, they are that data determines success and that there’s an awful lot of data to manage. And given the latter challenge, I propose a new axiom: Metadata is actually your most valuable asset.
This is particularly true as we move into an increasingly multi-cloud world. For many customers I talk to, “the cloud” is already many clouds: the big platforms, like AWS and Azure, plus specific platforms or applications such as Oracle, Salesforce, Workday, and so on.
Although the initial, futuristic promise was that cloud architectures would simplify our lives, the realities of multi-cloud architecture introduce new complexity and integration challenges. Basically, we have the potential to end up right back where we started, with all the silos and disconnections we suffered in the pre-cloud world. To avoid that trap, we need to more fully understand, and trust, the data.
It’s not easy. I can’t count the customers who’ve explained to me that their organizations have systems with multiple, conflicting definitions of seemingly simple terms like “revenue,” “sales,” or “customer.” And interestingly, the challenge isn’t necessarily, “How do I make every system (and line of business) define a customer in the exact same way?” The challenge is, “How can I always know which set of data defines customers in the way that’s appropriate for a specific use?” (For example, do I want a list of individual buyers or just a list of companies to which we sell our products?)
Fortunately, the tools and enabling technologies available have evolved to make it much easier to make a multi-cloud hybrid architecture successful. No longer do we have to rely on spreadsheets and SharePoint to track and manage this stuff. Which brings us back to the value of metadata.
Cataloging what you know
Metadata, among many other important uses, gives you the needed context to trust and relate sets of data. Metadata includes the definition, lineage, and governance model around our data. It tells us how current the data is, and which systems generated it.
The repository for all that metadata is the data catalog, and I think it’s important to view that catalog as an essential element of your data strategy. Data catalogs have developed quickly, incorporating automated tools that can scan our environments and use artificial intelligence to get a faster, more intelligent read on all our data, in all our systems. The catalog also authoritatively documents stewardship and processes.
Centralizing your metadata, the catalog becomes a universal decoder that lets you know the source and the definition of your data, so that analysts can understand the differences between data sets. Now you can pursue a “customer 360” initiative, accessing all relevant data—while understanding which “customer data” refers to a company, which defines the individual buyers of your product within that company, and which associated data sets are counting product sales by company versus end users.
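To make the "universal decoder" idea concrete, here is a minimal sketch of what a catalog lookup might look like. Everything in it is an illustrative assumption rather than any real product's API: the `CatalogEntry` fields, the data set names, and the `find_datasets` helper are all hypothetical. The point is simply that once definition, source, stewardship, and freshness live next to each data set, choosing the right "customer" data becomes a query instead of a guessing game.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    """One hypothetical catalog record: the metadata about a data set."""
    dataset: str          # where the data lives (made-up names below)
    term: str             # business term the data set claims to describe
    definition: str       # what that term means in this data set
    grain: str            # "company" or "individual"
    source_system: str    # system that generated the data
    steward: str          # who is accountable for it
    last_refreshed: date  # how current the data is


# Illustrative entries only; not taken from any real environment.
CATALOG = [
    CatalogEntry("crm.accounts", "customer", "A company we sell to",
                 "company", "Salesforce", "sales-ops", date(2018, 9, 1)),
    CatalogEntry("store.buyers", "customer", "A person who placed an order",
                 "individual", "web store", "ecommerce", date(2018, 9, 3)),
]


def find_datasets(term, grain):
    """Return the catalog entries that define `term` at the requested grain."""
    return [e for e in CATALOG if e.term == term and e.grain == grain]


for entry in find_datasets("customer", "company"):
    print(entry.dataset, "-", entry.definition, "- steward:", entry.steward)
```

An analyst who wants a list of companies asks for the "company" grain and is pointed at one data set, while someone counting individual buyers is pointed at another, with no need to force both systems onto a single definition.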
Data drives your cloud implementations
In dealing with a multi-cloud architecture, you’re likely to face one of two challenges. If you have a well-managed ecosystem, you’ll just have to make sure that each new instance is correctly added to the map. If today is the day you’ll start to establish that cohesive architectural governance, you’ve got to draw that first, accurate map.
When you’re just bringing in one new cloud environment—such as a new instance of Salesforce—make sure the metadata is solid, now and for the future. Are you starting the new instance from a master customer list? Does one need to be created? And what process is in place for adding new customer data over time? How will you govern the process to be sure that individual sales reps won’t be inputting incomplete and/or duplicative entries? Whatever the application or environment, it’s vital to define both your starting point and the road ahead for the data.
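As a rough illustration of governing that intake process, the sketch below checks a new customer record against a master list before it is accepted. The required fields, the email-based duplicate rule, and the `check_new_customer` function are assumptions made up for this example; a real environment would apply its own business rules and matching logic.

```python
# Hypothetical governance rule: fields every new customer record must carry.
REQUIRED_FIELDS = ("name", "email", "company")


def normalize(value):
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(str(value).lower().split())


def check_new_customer(record, master_list):
    """Return a list of problems; an empty list means the record can be added."""
    problems = []

    # Incomplete entries: any required field that is absent or blank.
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            problems.append(f"missing {field}")

    # Duplicative entries: the same email already exists in the master list.
    email = normalize(record.get("email", ""))
    if email and any(normalize(m.get("email", "")) == email for m in master_list):
        problems.append("duplicate email")

    return problems


# Made-up sample data for illustration.
master = [{"name": "Ada Lopez", "email": "ada@example.com", "company": "Acme"}]
candidate = {"name": "A. Lopez", "email": " ADA@Example.com ", "company": "Acme"}
print(check_new_customer(candidate, master))
```

Matching on a normalized email is a deliberately simple choice here. The design point is that the rule lives in one governed place, rather than in each sales rep's judgment.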
It’s a similar—though decidedly bigger—project to get your first true grip on a multi-cloud ecosystem that has evolved somewhat haphazardly over time. Again, you’ll focus on metadata. Take a good inventory of what’s in your existing cloud applications, looking at how well you’ve been applying best practices to the data. Identify the most significant problem areas and figure out whether you’ll start applying new business rules to clean up the data, or draw a line and only apply the new practices to data going forward. (Sometimes it’s more efficient to say, “From here forward, the data is well-governed and completely trusted. Earlier data has to be viewed as imperfect.”)
Context is key
The challenges that emerge as we expand into a multi-cloud architecture highlight one of the most important considerations of any digital transformation: We have to completely reimagine how IT and the business work together, without recreating the same sprawl of siloed data, disjointed applications, and us-versus-them processes. Providing a single source of contextual truth and governance insight, metadata, centralized in the data catalog, may be the single, invaluable key to help organizations make a clean break with a messy past.
For a look at the state of the market around metadata solutions, check out the 2018 Gartner Magic Quadrant for Metadata Solutions. Find out why Gartner has positioned Informatica furthest on the axis for completeness of vision and highest for ability to execute.