There is Just One V in Big Data
According to Gartner, 64% of organizations surveyed have invested or plan to invest in Big Data systems. More and more companies are diving into their data, trying to put it to use to minimize customer churn, analyze financial risk, and improve the customer experience.
Of that 64%, 30% have already invested in Big Data technology, 19% plan to invest within the next year, and another 15% plan to invest within two years. Fewer than 8% of Gartner’s 720 respondents, however, have actually deployed Big Data technology. That gap is a bad sign: most companies simply don’t know what they’re doing when it comes to Big Data.
Over the years, we have heard that Big Data is Volume, Velocity, and Variety. I feel this limited view is one of the reasons why, despite the Big Data hype, most companies are still stuck in neutral.
- Volume: The sheer amount of data, from terabytes and petabytes to exabytes and zettabytes
- Velocity: Streaming data, milliseconds to seconds, how fast data is produced, and how fast the data must be processed to meet the need or demand
- Variety: Structured, unstructured, text, multimedia, video, audio, sensor data, meter data, HTML, e-mails, etc.
For us, the focus is on the collection of data. After all, we are prone to be hoarders, wired by our survival instinct to collect and hoard for the leaner winter months that may come. So we hoard data, as much as we can, for the elusive “What if?” scenario: “Maybe this will be useful someday.” It’s this stockpiling of Big Data without application that makes it useless.
While Volume, Velocity, and Variety focus on the collection of data, Gartner introduced three additional Vs in 2014: Veracity, Variability, and Value, which focus on the usefulness of the data.
- Veracity: Uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations, accuracy, quality, truthfulness or trustworthiness
- Variability: The differing ways in which the data may be interpreted; different questions require different interpretations
- Value: Data for co-creation and deep learning
I believe that perfecting as few as 5% of the relevant variables will get a business 95% of the benefit. The trick is identifying that viable 5% and extracting meaningful information from it. In other words, “Value” is the long pole in the tent.