The Great-Data Divide


Over the past few weeks, I have been combing through statistics on the Return on Investment (ROI) and benefits people get from analytics. Just to be clear, I include the broad category of data preparation, visualization, predictive etc. in my definition of “analytics” here. In the process, I am building out a pretty stark contrast between the “haves” and the “have-nots”: those who succeed and those who follow.


Our focus on analytics and business intelligence (BI) is not a new phenomenon by any means. In fact, it seems BI has been on the Gartner Top Ten List for CIOs since 2004, and it has been near the top of that list for most of that time. For the past 12 years it has been a major focus area, yet we still have a vast disparity. Consider other issues that became prominent, got focused on, got solved, and dropped down or off the list. Think where virtualization was and where it is now.

Current State of Analytics

A couple of statistics

  • In 2013, an Economist Intelligence Unit report, The Data Directive, found that 97% of executives agreed that data is strategic, yet only 12% believed they were actually effective at using it.
  • The BI software market was already $14Bn in 2013 and is expected to grow to $20Bn by 2018.

Add services to this and it is likely to be 3-4 times higher. Then add hardware and we are in the $100Bn-plus range.

  • Despite BI being a top CIO focus and despite aggressive spending, BI Scorecard’s 2014 survey suggests only 21% penetration into organizations. That figure has stayed in the 18%-24% range since 2005.

But there is light at the end of the tunnel in that some folks are beginning to get it right. A 2013 Aberdeen report showed that Leaders get an ROI from BI projects in 6.7 months on average, compared to 16.3 months for the Followers. In other words, Leaders reach ROI roughly 2.4 times as fast as the Followers (16.3 / 6.7 ≈ 2.4).
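For readers who want to check the Leaders-versus-Followers comparison, here is a minimal sketch of the arithmetic. It uses only the two payback figures quoted above; the variable and function names are my own, and it assumes "better" is measured purely by how quickly a project reaches ROI.

```python
# Months to reach ROI, as quoted above from the 2013 Aberdeen report.
leaders_months = 6.7
followers_months = 16.3


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster the shorter payback period is."""
    return slow / fast


# Ratio of the two payback periods (Followers / Leaders).
ratio = speedup(leaders_months, followers_months)

# Absolute and relative reduction in time to ROI.
months_saved = followers_months - leaders_months
pct_shorter = months_saved / followers_months * 100

print(f"Leaders reach ROI {ratio:.1f}x as fast as Followers")
print(f"Payback is {months_saved:.1f} months ({pct_shorter:.0f}%) shorter")
```

Note that this compares time to payback only. It says nothing about the size of the return itself, which the quoted figures do not cover.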

A Shift Is Happening

In its latest BI Magic Quadrant, Gartner suggested there is a major shift to self-service BI, and even went as far as suggesting that by 2017 most business users and analysts will have access to these tools, a massive jump from our 20% range of a year ago. But it comes with even more danger. The same report suggested that through 2016, less than 10% of self-service BI initiatives will be governed sufficiently to prevent inconsistencies that adversely affect the business. In other words, some 90% are at risk of failing due to inconsistencies.

At the heart of the issue is that we can visualize as much as we want, but if it is bad data, it will simply result in bad decisions. And since we never systematically design for great-data that is continuously clean, safe and connected, every project becomes a new work of art. There are no economies of scale or organization-wide reuse and learning. The same Data Directive report mentioned earlier also stated that only some 15% of executives felt they were better than their competitors at making use of data. And therein lies the risk: the other 85% did not believe they were ahead of their competitors!

Three Principles For Success

Design for great-data: data that is clean, safe and connected. You have to find a way to have more continuous flows of data at the right level of trust, without having to recreate them from scratch every time a new initiative starts. Up to 80% of analyst time is spent on data prep. Standardizing on data platforms can be a massive help, especially if the BI tools landscape will increasingly be based on personal choice and preference.

Focus on the data about the data. If we are going to democratize data to the masses, then we need the metadata to help guide and shape that experience, making sure folks don’t hurt themselves in the process of doing self-service BI. In fact, if we have a converged data platform as per our first point, then the metadata from that platform can guide and recommend ways the analyst can better shape and prepare their own data. Think Amazon-like shopping for data.

Reduce the complexity of introducing new technologies. New technologies will come and replace or augment a previous generation. Our current fascination with Big Data is just one example. What we call big today will be laughed at in five years’ time. How do we get our platforms, mappings and rules to be designed once and deployed again and again on a new technology? Our approach to new technologies becomes a self-inflicted disruption if we always have to start from scratch. This does not mean you don’t leverage the new, but simply picking a stand-alone solution for the sake of a new technology will not help us scale or do things faster.

Parting thoughts

I have been involved in the BI and analytics space for the best part of 30 years and have always believed in the power of insight and enlightenment. But it has always been too expensive or too difficult to really deploy this to the masses, which is why it has stayed the privilege of the few. The Jevons paradox applies to our domain too: if we can lower the cost of doing BI and analytics, it will allow us to do more of it and will ultimately lead to more innovation and better outcomes. It is not about the few massive decisions but rather about experimentation and rapid iteration.

Read more about designing for great data.