Big Data is different from small data
Definitions of Big Data vary. Many authors define it by its characteristics: the volume, velocity, and variety of the data. VC Choudary, Associate Professor of Information Systems at UCI, for example, says “what differentiates Big Data from traditional data is the sheer volume of information, velocity at which it is created, and the variety of sources from which it is drawn.” Hurwitz and Associates, a BI consultancy, defines Big Data similarly as the capability to manage a huge volume of disparate data, at the right speed, and within the right time frame to allow real-time analysis and reaction.
But what about the business practitioner’s point of view? Recently, I heard a prominent healthcare CIO talk about Big Data. He defined Big Data by first defining “small data”. He said small data is “single-source, often batch-processed, and locally managed”. So what then is Big Data? “It is multi-source, requires connecting between data sources, multi-structured (structured and unstructured), real time, and uses information in aggregate”. This healthcare CIO went on to say that he sees “Big Data aiming to establish a model from the data. Big Data is about finding data relationships in the data rather than creating the data relationships in a data model”. This is a huge difference from traditional business intelligence (BI), which works best when the data in the data model is largely deterministic, that is, when the relationships are known in advance.
Parallel architectures enable Big Data
Truly parallel architectures are an enabler of Big Data. To be fair, parallel architectures are not new; they have existed for some time. I remember seeing my first server-based parallel architecture in the work that Intel and other chipset makers were doing back in the mid-1990s. And to be really fair, Von Neumann defined parallel processing and serial processing architectures at the same time. What is new is that over the last few years we have lost a degree of parallelism as we have sought to centralize and protect data. What Hadoop does is gang together many cheap machines while spreading both the data and the processing across them. Redundancy is achieved by sharing each processing load with more than one machine.
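To make the idea of spreading data and processing concrete, here is a minimal Python sketch. It is not Hadoop's API: the function names (`split_into_blocks`, `place_blocks`, `run_job`), the node names, and the use of threads to stand in for separate machines are all assumptions made for illustration. It cuts the input into blocks, places each block on more than one node, counts words per block in parallel, and merges the partial counts.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Cut the input into fixed-size blocks, the unit that gets spread around."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def place_blocks(num_blocks, nodes, replicas=2):
    """Assign every block to `replicas` nodes, round-robin.

    Assumes replicas <= len(nodes) so the copies land on distinct nodes.
    """
    return {
        b: [nodes[(b + r) % len(nodes)] for r in range(replicas)]
        for b in range(num_blocks)
    }


def count_words(block):
    """The 'map' step: work is done on one block only."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def run_job(records, nodes, failed=frozenset(), block_size=2, replicas=2):
    """Count words across all blocks, skipping nodes listed in `failed`."""
    blocks = split_into_blocks(records, block_size)
    placement = place_blocks(len(blocks), nodes, replicas)

    def process(block_id):
        # Use the first node holding this block that is still alive.
        for node in placement[block_id]:
            if node not in failed:
                return count_words(blocks[block_id])
        raise RuntimeError(f"block {block_id} lost: all replicas failed")

    # Threads stand in for machines in this sketch.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(process, range(len(blocks))))

    # The 'reduce' step: merge the per-block counts into one result.
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total
```

Calling `run_job(records, ["n1", "n2", "n3"], failed={"n1"})` returns the same counts as a run with no failures, because every block also lives on a second node. The job only fails when every node holding some block is down.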
Big Data moves from descriptive statistics to predictive analytics
According to Choudary, Big Data is not just about the amount of data that can be processed. It is also about what you can do with the data. He claims that “Big Data is about changing the game of business from one of simple descriptive statistics into one where all of the available data is collected and mined together. The Big Data era is about predicting outcomes based on disparate pieces of information and therefore, it is about prescribing opportunities.”
Real Life Big Data Case Studies
So what is Big Data good for?
Let’s start with what has already been learned from Big Data analysis in healthcare. Analysts have found that people with higher pain scores crash more often in the ICU. It is scary to me that they are only now learning this! Another big issue in healthcare is readmissions, and healthcare reform creates big penalties for them. To help limit them, it is really important to know whether patients manage their illness after they leave the hospital. What has been learned from studying patient credit scores is that they are a good predictor of whether patients will take their medicines and, therefore, of whether they are likely to be readmitted. The higher the credit score, the higher the probability that people will take their medicine after leaving the hospital. I found this particularly interesting because several years ago I got to work with Intuit. They had identified a persona for those who were meticulous with their finances. They called them anal-retentives. Big Data has determined that anal-retentives take their medicines more often. So hospitals should check in more regularly on those with poor credit scores to make sure that they are taking their medicines and thus limit readmissions.
The healthcare CIO mentioned earlier claims that Big Data will over time move from “differentiating healthcare organizations to table stakes.” When I asked him why, he said the reason is simple: “We are in the business of creating the highest value care. And big data is fundamentally about serving our patients better than we do today. And everyone in healthcare will have to do it.” Another healthcare CIO says that he is looking to Big Data to help him better understand the relationship of inputs to outputs concerning patients. We need to have a better understanding of the health status and needs of a specific patient over time. This means assembling data from multiple patient encounters and multiple sources. He goes on to say that “those organizations with strong partnerships up and down the value chain or for that matter, even among competitors, are better positioned to take insights, process improvements, and other advantages to the market. Use and management of data will increasingly become an element of competitive advantage”.
Big Data helps with Credit Risk
It is not just healthcare that sees big changes being enabled by Big Data. The president of a major credit reporting agency sees Big Data as an enabler of risk reduction for his firms. He also asserts that, on average, firms use less than 5% of the data available to them. This is important in financial markets, where the quality of risk management can determine the earnings returned to shareholders. He says, however, that one challenge in Big Data is hiring talented people who can ask the right questions of data. As in many growth industries, hiring talented practitioners and data scientists is difficult.
Big Data makes Fast Food Better
Meanwhile, a major fast food vendor says that Big Data has enabled them to better understand their market from the outside in and across many disciplines, including public relations, customer service, marketing, advertising, research, product innovation, and sales. Clearly, creating a view across all these touch points can lead to better decision making.
What does it mean?
So there you have it. Big Data is big in terms of what it involves and what it is trying to accomplish. It has already produced interesting outcomes for healthcare, credit reporting, and even fast food. The question is what it is doing or can do for your business. Please feel free to share your results here.