Big Data takes a lot of forms and shapes, and flows in from all over the place – from the Internet, from devices, from machines, and even from cars. In all the data being generated are valuable nuggets of information.
The challenge is being able to find the right data needed, and being able to employ that data to solve a business challenge. What types of data are worthwhile for organizations to capture?
In his new book, Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams With Advanced Analytics, Bill Franks provides a wide array of examples of the types of data that can best meet the needs of business today. Franks, chief analytics officer with Teradata, points out that his list is not exhaustive, as there is an almost unlimited number of sources that will only keep growing as users discover new ways to apply the data.
Here are the fundamental data sources now part of the Big Data analytics picture:
- Text data: Text comes from emails, text messages, tweets, social media postings, instant messages, real-time chats and audio recordings that have been translated into text. The primary application is sentiment analysis, which “looks at the general direction of opinion across a large number of people to provide information on what the market is saying, thinking and feeling about an organization,” says Franks. Additional applications are fraud detection and pattern recognition. Text data will be “one of the most widely used forms of Big Data,” he adds.
- Time and location data: More and more data is available through global positioning systems (GPS), personal GPS devices, and cellular phones. “As an organization collects time and location data on individual people and assets, it starts to get into the realm of Big Data quickly,” says Franks. “This is especially true if frequent updates to that information are made.” He adds that an emerging application employing time and location data is the “generation of offers for customers that are only good for a specific time period and a specific location. Such offers can be far more powerful and targeted than others for a broad range of time and locations.”
- Telematics data: Insurance companies currently are making the most use of telematics, which involves placing a sensor in a customer’s car to capture information on his or her driving habits. “Telematics data helps insurance companies better understand customer risk levels and set insurance rates,” says Franks. But there’s far more potential to this technology, he adds. Looking beyond insurance applications, telematics-related information can help measure wear and tear on cars and determine traffic patterns.
- Radio frequency identification data: RFID is employed in inventory tracking and automated toll tags, and potential uses being explored include identifying fraud and store tracking via RFID readers placed in shopping carts. “Today, a passive RFID tag costs just a few cents and prices continue to drop,” says Franks. “As prices continue to drop, the feasible uses will continue to expand.” He adds that “with RFID data, like many other Big Data sources, the power just isn’t in what RFID data can tell you uniquely by itself. It is in what it can tell you when combined with other data.”
- Sensor data: An application for sensor data may be within engines, either on production floors or in vehicles, says Franks. “By capturing and analyzing detailed data on engine operations, it’s possible to pinpoint specific patterns that lead to imminent failures. Patterns over time that lead to lower engine life or more frequent repair can also be identified.” Some examples of the types of questions that can be studied include: “Does a sudden drop in pressure indicate imminent failure with near certainty? Does a steady decrease in temperature over a period of a few hours point to other problems?”
- Social network data: This is a high-impact area, and already is playing a role in changing the way organizations value customers, says Franks. “Instead of solely looking at a customer’s individual value, it is now possible to explore the value of his or her overall network. This can lead to drastically different decisions about how to invest in that customer. A highly influential customer needs to be coddled well beyond what his or her direct value indicates if maximizing a network’s total profitability takes priority over maximizing each account’s individual profitability.”
- Web data: Web data can reveal important nuggets such as shopping behaviors, customer purchase paths and preferences, research behaviors, feedback behaviors, what to provide for “the next best offer,” attrition modeling, response modeling, customer segmentation, and advertising results. “Tremendous value can be generated from analysis of faceless customers who are identified only by an arbitrary identification number,” Franks points out. “This way, neither analysts, nor anyone else, can identify who each customer actually is. Only the patterns matter.”
- Telemetry data: The gaming industry currently benefits the most from this type of Big Data, culled from controller or keyboard actions. “By mining players’ playing patterns, insights can be obtained into what types of game behavior are associated with renewals and which are not,” says Franks.
- Smart-grid data: This type of data, now available from smart home meters and devices, will finally help the utility industry catch up to the 21st century. “Utilities are already aggressively moving to pricing models that vary by time of day and demand, and the smart grid will only accelerate that,” he says. “In effect, power companies will have the ability to do all of the customer analytics other industries have been able to do for years.”