AWS re:Invent: A Data Management View

That was a very interesting and exciting AWS re:Invent conference last week in Las Vegas! The pace of innovation was high as Amazon announced two dozen new services. The most interesting part for me was in the keynote on the second day where Werner Vogels, CTO of Amazon, described their vision for a data architecture.

First, Werner opened the section by talking about data as a competitive advantage. Regular readers of this blog will know that this was music to my ears. His point was that as more and more companies build their applications from the ever-expanding number of services that AWS offers, their applications will become more and more standardized because they are all using the same services. How then do they differentiate themselves? The answer is data.




Then Werner went on to describe the AWS data architecture vision. It was pretty comprehensive, covering everything from data ingestion to data quality and data prep for analytics. The bottom line: I am happy to see Amazon focusing on data. The AWS data architecture was aimed at their design center: AWS platform developers. And they have a very comprehensive vision for that audience.




The bigger problem for larger organizations will be enterprise-wide data management, which was not addressed here. The pace of innovation in the cloud and on the AWS platform specifically is very attractive to developers. But, for the foreseeable future, the majority of large organizations will be trying to manage data as their competitive advantage and primary differentiator across multiple environments, such as:

  • Traditional on-premise data sources
  • Third-party data sources – which are far less controllable in terms of data quality and the business metadata attached
  • Big data and big data analytics
  • And, of course, cloud environments. And that is deliberately plural.

Most large enterprises will be dealing with multiple cloud platforms. They may have platforms for the various cloud applications and analytics they are running. These applications may be on-premise applications that are hosted in the cloud or they may be native, managed, cloud applications. They may also be building new, custom applications, composed of services provided by the cloud platform (like AWS).

The point here is that modern data management is going to get a lot more complex in the near term. A good data management architecture needs to work across on-premise, big data, and probably multiple different cloud platforms and applications. The mix of these will undoubtedly change over time as new technologies emerge and the organization’s data center of gravity moves to the cloud.

The organizations that will prosper in this environment will be those that design a data management architecture that can handle all of these different use cases and evolve with the needs of the organization. It will also be critical to increase the productivity and self-service of data delivery to fuel new applications and analytics initiatives. Finally, it will be important to have an end-to-end view of the data capabilities that make up an enterprise data architecture, such as data integration, data quality, application integration, business process integration, master data management, B2B data exchange and more.

Informatica was called out several times as a key partner of Amazon during the two days of keynotes. We look forward to a continued and deeper partnership with Amazon and the Amazon community for data management.