How to Make Great Data Work? Data Architectures for Agility and Compliance

Organisations have finally come to realise that data is one of the essential differentiating pillars for surviving and thriving in today’s competitive environment. For a long time, data was part of, or hidden in, applications. However, with the advent of social media and mobile apps, increasing demands from the business, and the falling cost of computing and storage, the face of IT has changed forever. As organisations strive for more agility and speed, they recognise that data is arguably their most valuable asset. Many organisations have established the role of Chief Data Officer (CDO), which is a testament to how important data has become.

But how can organisations get the best value out of all the data available whilst retaining the flexibility and scalability that enable business agility and innovation? And how can this remain affordable and cost-controlled? The answer to these questions is, in my opinion, twofold. First, a data governance policy, enforced from the top down, should be defined so that definitions, ownership, compliance rules, retention periods, etc. are agreed upon across the organisation. Second, a reliable and scalable data platform needs to be in place, not only to manage all of the data logistics between the various systems, but also to enforce policies so that all data is available in a secure and trustworthy way. With a robust data platform in place, preferably combined with a solid Master Data infrastructure, any new data-dependent business initiative can be executed quickly and affordably.
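To make the governance idea concrete, here is a minimal Python sketch of what an agreed policy record for one data set might look like, and how a platform could check a retention period against it. The class name, field names and example values are illustrative assumptions, not part of any particular product or standard.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical governance record for one data set (illustrative fields only)."""
    dataset: str
    definition: str            # the agreed business definition
    owner: str                 # the accountable business owner
    retention_days: int        # the agreed retention period
    contains_personal_data: bool


def is_past_retention(policy: DataPolicy, created: date, today: date) -> bool:
    """Return True when a record is older than the agreed retention period."""
    return today - created > timedelta(days=policy.retention_days)


# Example values are invented for illustration.
customer_policy = DataPolicy(
    dataset="customer",
    definition="A party that has bought at least one product",
    owner="Sales Operations",
    retention_days=365,
    contains_personal_data=True,
)
```

Keeping such records in one place is what would allow a platform, rather than each individual application, to apply the same rules everywhere.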

Depending on the maturity and size of the organisation, different patterns can be used for implementing a solution architecture. Let’s have a closer look at how organisations have achieved this.

Firstly, most companies have gone through various stages of analytics, from ad hoc reporting to data marts to enterprise data warehouses (EDW). Mature organisations still have these in place for descriptive analytical use cases (performance indicators, analysing what happened and figuring out why, etc.). As these EDW implementations have been mired in very inflexible IT processes, business users have always been looking for more agile alternatives. With the tremendous growth of available data (social media, logs, devices), the need for agility has never been greater; business and IT are exploring ways to make this possible.

New paradigms are popping up, such as Data Hubs, Data Lakes and Data Reservoirs. Conceptually, however, there is not a vast difference between these scenarios. Data Hubs are implemented to give business data owners the opportunity to “publish” data sets that can be consumed by other departments or “subscribers”. The hub architecture eliminates most of the point-to-point interfaces and decouples data sources from their destinations. This concept is used not only for hub-and-spoke data warehouse projects, but also for application-to-application data exchange and in application consolidation or migration projects.
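The publish/subscribe idea behind a hub can be sketched in a few lines of Python. This is a toy in-memory illustration under my own assumptions (the `DataHub` class, the topic name and the inboxes are all hypothetical); a real hub would add persistence, scheduling, security and data quality checks.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, object]
Subscriber = Callable[[List[Record]], None]


class DataHub:
    """Toy in-memory hub: publishers and subscribers only know topic names."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Subscriber) -> None:
        """Register a consumer for a published data set."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, records: List[Record]) -> int:
        """Deliver a data set to every subscriber; return how many were notified."""
        for handler in self._subscribers[topic]:
            # Shallow copy, so one subscriber cannot reorder or truncate another's list.
            handler(list(records))
        return len(self._subscribers[topic])


hub = DataHub()
finance_inbox: List[Record] = []
marketing_inbox: List[Record] = []

# Two departments subscribe to the same published data set.
hub.subscribe("customers", finance_inbox.extend)
hub.subscribe("customers", marketing_inbox.extend)

# The data owner publishes once, without knowing who consumes the data.
delivered = hub.publish("customers", [{"id": 1, "name": "Acme"}])
```

The publisher never references a subscriber directly, which is the decoupling described above: with n sources and m destinations, up to n × m point-to-point interfaces shrink to n + m connections to the hub.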

Another architecture that is gaining popularity is the Data Lake: all systems and applications store a copy of their (relevant) data, structured as well as semi-structured, in a big “lake” of data, which most of the time runs on the Hadoop software platform. This architecture enables organisations to remove data silos and have all data at hand, for example to perform analytics or research. Creating a data lake is not a huge problem; the challenge lies in extracting value from it! As a result, it is even more important to manage semantic metadata and to enrich the data by implementing an intelligence layer on top of the data lake.

In conclusion, I believe that different data architecture patterns (DWH, Hubs, Lakes) can be chosen for specific business purposes. However, I also believe that, as change is the only certainty in the future, a solid Data Platform needs to be in place to deliver:

  • Access to all relevant data, on premise or in the cloud
  • The capability to master and govern all enterprise data
  • Agility for business users launching new initiatives
  • Scalability for growth in volume and increase in speed
  • Security and masking capabilities to stay or become compliant with regulations
  • As much self-service and collaboration functionality as possible for business users, to democratise the use of enterprise data assets
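As an illustration of the masking capability mentioned in the list above, the following Python sketch replaces sensitive fields with a salted hash so that records can still be joined on the masked value without exposing the original. The function name, field names and salt are assumptions made for the example; whether this kind of pseudonymisation is sufficient depends on the regulation in question.

```python
import hashlib
from typing import Dict, Set


def mask_record(record: Dict[str, str], sensitive: Set[str], salt: str) -> Dict[str, str]:
    """Replace sensitive values with a salted SHA-256 digest prefix; keep other fields."""
    masked: Dict[str, str] = {}
    for field, value in record.items():
        if field in sensitive:
            digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
            masked[field] = digest[:12]  # shortened for readability in this sketch
        else:
            masked[field] = value
    return masked


# Hypothetical record and salt, for illustration only.
customer = {"email": "jane@example.com", "country": "NL"}
masked_customer = mask_record(customer, sensitive={"email"}, salt="demo-salt")
```

The same input and salt always give the same masked value, so self-service users can still count or join on the field while the raw value stays hidden.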
