In recent times, the big Internet companies – the Googles, Yahoos and eBays – have proven that it is possible to build a sustainable business on data analytics, in which corporate decisions and actions are seamlessly guided by an analytics culture based on data, measurement and quantifiable results. Now, two of the top data analytics thinkers say we are reaching a point where non-tech, non-Internet companies are on their way to becoming analytics-driven organizations in a similar vein, as part of an emerging data economy.
In a report written for the International Institute for Analytics, Thomas Davenport and Jill Dyché divulge the results of their interviews with 20 large organizations, in which they find big data analytics to be well integrated into the decision-making cycle. “Large organizations across industries are joining the data economy,” they observe. “They are not keeping traditional analytics and big data separate, but are combining them to form a new synthesis.”
Davenport and Dyché call this new state of management “Analytics 3.0,” in which the concept and practices of competing on analytics are no longer confined to data management and IT departments or quants – analytics is embedded into all key organizational processes. That means major, transformative effects for organizations. “There is little doubt that analytics can transform organizations, and the firms that lead the 3.0 charge will seize the most value,” they write.
Analytics 3.0 is the third and current of three distinct phases in the way data analytics has been applied to business decision making, Davenport and Dyché say. The three “eras” look like this:
- Analytics 1.0, prevalent between 1954 and 2009, was based on relatively small, structured data sets from internal corporate sources.
- Analytics 2.0, which arose between 2005 and 2012, saw the rise of the big Web companies – the Googles and Yahoos and eBays – which were leveraging big data stores and employing prescriptive analytics to target customers and shape offerings. This time span was also shaped by a growing interest in competing on analytics, in which data was applied to strategic business decision-making. “However, large companies often confined their analytical efforts to basic information domains like customer or product, that were highly-structured and rarely integrated with other data,” the authors write.
- In the Analytics 3.0 era, analytical efforts are being integrated with other data types, across enterprises.
This emerging environment “combines the best of 1.0 and 2.0—a blend of big data and traditional analytics that yields insights and offerings with speed and impact,” Davenport and Dyché say. The key trait of Analytics 3.0 “is that not only online firms, but virtually any type of firm in any industry, can participate in the data-driven economy. Banks, industrial manufacturers, health care providers, retailers—any company in any industry that is willing to exploit the possibilities—can all develop data-based offerings for customers, as well as supporting internal decisions with big data.”
Davenport and Dyché describe how one major trucking and transportation company has been able to implement low-cost sensors for its trucks, trailers and intermodal containers, which “monitor location, driving behaviors, fuel levels and whether a trailer/container is loaded or empty. The quality of the optimized decisions [the company] makes with the sensor data – dispatching of trucks and containers, for example – is improving substantially, and the company’s use of prescriptive analytics is changing job roles and relationships.”
New technologies and methods are helping enterprises enter the Analytics 3.0 realm, including “a variety of hardware/software architectures, including clustered parallel servers using Hadoop/MapReduce, in-memory analytics, and in-database processing,” the authors add. “All of these technologies are considerably faster than previous generations of technology for data management and analysis. Analyses that might have taken hours or days in the past can be done in seconds.”
Another key characteristic of big data analytics-driven enterprises is the ability to fail fast – to deliver, with great frequency, partial outputs to project stakeholders. With the rise of new ‘agile’ analytical methods and machine learning techniques, organizations are capable of delivering “insights at a much faster rate,” and providing for “an ongoing sense of urgency.”
Perhaps most importantly, big data and analytics are integrated and embedded into corporate processes across the board. “Models in Analytics 3.0 are often being embedded into operational and decision processes, dramatically increasing their speed and impact,” Davenport and Dyché state. “Some are embedded into fully automated systems based on scoring algorithms or analytics-based rules. Some are built into consumer-oriented products and features. In any case, embedding the analytics into systems and processes not only means greater speed, but also makes it more difficult for decision-makers to avoid using analytics—usually a good thing.”
The report is available here.
With the growing prominence of big data as both a strategic and tactical resource for enterprises, there’s been a shift in the scope of business intelligence. Not too long ago, BI lived in tools that ran on individual workstations or PCs, providing filtered reports on limited sets of data, or stacking the data into analytical cubes.
Now, BI encompasses a range of data and analytics from across the enterprise, and is as likely to be online, supported in the cloud, as it is to be on a local PC. However, as it has been for years, BI adoption still tends to be limited, not reaching its full potential. BI analyst Cindi Howson asks the question: What’s holding companies back from achieving a big impact with BI? In a recent Q&A with TDWI’s Linda Briggs, she discussed the issues raised in her new book, Successful Business Intelligence: Unlock the Value of BI and Big Data.
The success of BI depends, more than anything, on one factor, she says: corporate culture. Some organizations have achieved an analytic culture that reaches across their various business lines, but for many, it’s a challenge. “Leadership means not just the CIO but also the CEO, the lines of business, the COO, and the VP of marketing,” says Howson. “Culture and leadership are closely related, and it’s hard to separate one from the other.”
While corporate culture has always been important to success, it takes on an even more critical role in efforts to compete on analytics. For example, she says, “companies have a lot of data, and certainly they value the data, but there is sometimes a fear of sharing it. Once you start exposing the data, somebody’s job might be on the line, or it can show that someone made some bad decisions. Maybe the data will reveal that you’ve spent millions of dollars and you’re not really getting the returns that you thought you would in pursuing a particular market segment or product.”
It’s important to see an analytics culture as focusing on data as a tool to see problems and make course corrections, or act on opportunities – not to punish or expose individuals or departments.
Another point of corporate resistance is employing BI in the cloud, a challenge recently explored by Brad Peters, CEO of Birst. Here again, corporate culture may hold back efforts to move to the cloud, which offers greater scalability and availability for BI and analytics initiatives. In a recent interview in Diginomica, he says that IT departments, for example, may throw up roadblocks, for fear of being disintermediated. Plus, there is also a recognition that once BI data is in the cloud, it often gets “harder to work with.” Multi-tenant sites, for example, have security systems and protocols that may limit users’ ability to manipulate or parse the data.
The increasing adoption of cloud-based services – such as those from Amazon or Salesforce – is gradually melting resistance to the idea of cloud-based BI, Peters adds. He particularly sees advantages for geographically dispersed workforces.
For his part, he admits that he “has never been under any illusion that the shift of enterprise analytics to the cloud was going to happen overnight.”
Big Data means many things to many people – it all depends on their place and perspective in the organization. But there is something for everyone.
I explored the advantages being seen across the enterprise in a recent special report in Database Trends & Applications, and have distilled the key points below:
For data managers, it’s all about choice. The rise of the Big Data environment has brought with it a new generation of solutions, including open source, NoSQL and NewSQL databases – not to mention Apache Hadoop and cloud-based data environments. Big Data is extremely accessible now because of low-cost solutions for capturing and analyzing unstructured forms of data that haven’t been available until recently. Consider all the sensor data – from RFID tags, from machines – that’s been floating around for the past decade. Previously, capturing and managing such data was never cheap. With more inexpensive databases and tools such as Hadoop, such data is now within the reach of the smallest organizations. In addition, cloud provides almost unlimited capacity, and can support and provide big data analytics at a scale that would otherwise be prohibitive for most organizations.
For data scientists, analysts and quants in organizations, it’s all about capabilities. The new Big Data world is all about diving deep into datasets and being able to engage in storytelling as a way to make data come alive for the business. Open source plays a key role, through frameworks such as Hadoop and MapReduce. There is also the highly versatile R language, which is well-suited for building analytics against large and highly diverse data sets. Predictive analytics is another key capability made real by Big Data.
For business users across the enterprise, it’s all about collaboration. There has been a growing movement to open up analytics across the organization – pushing Big Data analysis down to all levels of decision-makers, from front-line customer service representatives to information workers. New capabilities such as cloud services, visualization and self-service enable end users without statistical training to build their own queries and draw their own conclusions. Along with user-friendly interfaces to Big Data, there’s been a rise in pervasive BI and analytics running in the background, embedded within applications or devices, in which the end-user is oblivious to the software and data sources feeding the applications. Cloud opens up business intelligence and analytics to more users. In addition, more organizations are focusing on providing Big Data analytics through apps on mobile devices, accelerating the move toward simplified access.
For the members of the executive suite, it’s all about competitiveness. Most executives grasp the power that big data can bring to their operations, especially with performance analytics, predictive analytics and customer analytics. Employing these analytics against Big Data means better understanding customers and markets, as well as becoming aware of trends that are still bubbling beneath the surface.
The greatest challenge to big data management and analysis isn’t necessarily the technical underpinnings, but rather, lingering executive confusion and uncertainty about what it is and what it can do for their organizations.
The main issue – and root of executive befuddlement – is the abject and ongoing confusion about what, exactly, is meant by “big data.” It’s certainly a hyped-up term for something that has been around for a long time. If you had a one-terabyte database around the turn of the century, you had big data, that’s for sure. If you had a 500-megabyte database in 1990, that would have been big data.
So the “volume” has always been there, and has always been a relative measure. The same goes for the “variety” aspect of big data. Unstructured data – such as word documents or machine log data – has been floating around organizations for decades now. How about the “velocity”? Real-time processing has been on corporate radar screens for well over a decade.
So, what’s changed that we suddenly see this data as an enabler, a game-changer, opening up the gates to a brave new world of analytics-driven purpose? The rise of relatively cheap open-source tools and platforms, for one. Capturing and analyzing large volumes of fast-moving data of various structures once required very expensive equipment and consulting assistance. The expensive consultants may still be needed, but the technology is now within reach for many organizations.
With this in mind, it is interesting to see that business leaders are warming up to the possibilities of big data, as a new industry survey shows. But what it is exactly they think they’re warming up to is still a big question mark. The survey, conducted among 500 business and IT executives by CompTIA, shows the big data phenomenon has caught the eyes of executives. The vast majority of organizations, 78%, say they feel more positive about big data as a business initiative this year compared to a similar survey conducted a year ago. And, remarkably, 57% feel they’ve made progress in moving in the right direction with data-driven programs, compared with 37% the year before.
To their credit, the CompTIA study’s authors question how accurately these findings actually translate to progress on the big data front: They note that while this year’s survey finds 42% of respondents claiming to be engaged in some sort of big data initiative – more than double the share from a year ago (19%) – such initiatives may be “big data” in name only. “This may stem from confusion or reflect the possibility of different users interpreting the concept of big data in different ways,” they observe.
So what we have is a lot of organizations diving into what they see as “big data” projects because that’s what everybody tells them they should be doing. But how much of this is simply the same types of data management and analytics projects that may have been undertaken five or 10 years ago?
To really be making the most of big data as we understand it today, organizations should be addressing the following questions:
How much unstructured data is coursing through the organization, and how much of it is worth harvesting? It’s usually easy to measure the amount of structured data, such as that stored in relational databases or data warehouses, but unstructured data is a huge question mark. In many cases, management is clueless about what types of unstructured assets (user-generated files, machine-generated data) are actually available. It’s going to take a lot of research and discovery to uncover the unstructured data assets that are truly meaningful for the business.
Does the current data architecture support the introduction and integration of data sources? Most traditional data architectures are fairly rigid, built to support the inputs and outputs of relational data. Efforts involving other forms of data tend to be one-off projects, in which connectors or interfaces are hand-built for a single purpose and then forgotten. Reaching out and exploring new and varied types of data requires an architecture in which new sources can be rapidly and seamlessly introduced, without the usual silos.
Is the organization moving to an analytics culture? Big data analytics will never be “big” if it only is available to a few select decision makers or analysts. Big data will pack its punch when it enables decision makers at all levels of the organization – from customer care centers to production floors to the executive suite – to access analytics from various data sources. Even more helpful would be a way in which decision-makers can access analytical tools and back-end data sources through self-service approaches.
“Raw data is both an oxymoron and a bad idea. On the contrary, data should be cooked with care.” This was a statement made by Geoff Bowker in 2005, and served as the opening lines of a recent talk by Kate Crawford, principal researcher at Microsoft Research New England and a visiting MIT professor, who urged that big data be adopted and handled cautiously.
In her keynote at the recent DataEDGE 2013 conference, held at the University of California at Berkeley, Crawford said the time is now to have a discussion on the implications big data is having on business and society.
She outlined the six myths that have arisen around big data:
Myth #1: Big data is new. References to big data began to pop up in the literature in the late 1990s, but this is something some prominent industries, such as financial services firms and oil companies, have been wrestling with for decades, Crawford says. What is new, however, “is the fact that a lot of the tools of big data are becoming more easily reached by a lot more people. We’re having an explosion in ideas, creativity and imagination in terms of what we can do with these technologies.” This is the time to discuss the implications of big data, she adds, because much of it will be invisible within a few years as the tools and technologies mature. “Really usable systems and really good technologies disappear,” she states. “The easier they are to use, the harder they are to see.”
Myth #2: Big data is objective. Actually, big data sets can be very biased, Crawford states. For example, she says, she pored through 20 million tweets sent out about Hurricane Sandy, which flooded her neighborhood in Manhattan last year. While the tweets tell a compelling story about how residents coped, they mainly represent the views of younger, more well-to-do Manhattan residents. “If we look a little closer at the tweets, most were coming out of Manhattan, which has a higher concentration of people using smartphones, and a higher concentration of Twitter users – a subset of a subset. There were very few tweets coming from the far more affected areas, such as Breezy Point or the Far Rockaways. Because we don’t have the data from those places, we essentially have very privileged urban stories. We have to be really clear who we’re talking about, we have to think about what this data really represents,” she says.
Myth #3: Big data doesn’t discriminate. “There’s a myth that says essentially because you’re dealing with large data sets, you can somehow avoid group-level prejudice,” Crawford cautions. She pointed to a recent study of the Facebook “likes” of 60,000 people that found such data can be used to identify a person’s race, sexual orientation, religious views, political leanings, and even if they are a previous drug or alcohol user. “The researchers also expressed a set of concerns that this data can be bought by anyone. Ultimately, employees can make decisions about individuals based on this data.”
Myth #4: Big data makes cities smarter. While big data goes a long way to improve the management of city problems, it also may under-represent communities. “Not all data is created or collected equally – there are always certain communities of people who are going to be left out of those data sets,” Crawford says. For example, last year, the city of Boston released an app called StreetBump, which automatically registered potholes by passively collecting GPS data from drivers’ smartphones. The program collected a great deal of data on potholes. However, she adds, “wealthier younger citizens are more likely to have smartphones, and therefore, wealthy areas with younger people would get more attention, while areas with older residents with less money will get fewer resources.”
Myth #5: Big data is anonymous. Crawford cited a recent study, published in Nature, which determined that individuals could be identified with no more than four data points, including their cell phone number. Before the advent of personal technology, it took about 12 data points to identify an individual. “It’s very difficult to make data anonymous – even with two randomized data points, it’s possible to identify 50 percent of people.” Another big data initiative, the smart grid being adopted by electric utilities, will capture a wealth of data – from energy usage to “when you have friends over, when you are sleeping. This is some very intimate data.”
Myth #6: You can opt out of big data. There are suggestions that people will be able to protect their privacy if they pay a fee for web services to opt out of tracking, versus using services for free in exchange for giving up some information. Crawford cautions that this will result in a two-tier system, which “turns private data into a luxury good rather than a public good.”
Rather than making data privacy and management an individual choice, Crawford urges a more public discussion on “the way that the data is essentially flowing between corporations, individuals and governments.”
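To make Myth #5 a little more concrete, here is a minimal Python sketch of why a handful of ordinary attributes can single people out. It is only an illustration, not the method used in the study Crawford cited: the function name `unique_fraction`, the field names and the four toy records are all hypothetical. It simply counts what share of records become one-of-a-kind as more fields are combined.

```python
from collections import Counter


def unique_fraction(records, fields):
    """Return the share of records whose combination of `fields` is unique."""
    # Build one key per record from the chosen fields.
    keys = [tuple(record[field] for field in fields) for record in records]
    if not keys:
        return 0.0
    counts = Counter(keys)
    # A record is "identifiable" here if nobody else shares its key.
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(keys)


# Hypothetical toy data, invented purely for illustration.
people = [
    {"zip": "10001", "birth_year": 1980, "gender": "F"},
    {"zip": "10001", "birth_year": 1980, "gender": "M"},
    {"zip": "10001", "birth_year": 1975, "gender": "F"},
    {"zip": "10002", "birth_year": 1980, "gender": "F"},
]

for fields in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "gender"]):
    print(fields, unique_fraction(people, fields))
```

In this toy set, each added field raises the share of records that stand alone, which is the basic reason stripping names from a data set does not by itself make it anonymous.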
The drive to achieve competitive advantage with Big Data is creating a lot of interesting opportunities for managers and professionals working in the data analytics space. In some cases, new job categories not imaginable just a few years back are being created, and are in demand. “Data scientist” is only one of these descriptions, and there are jobs that don’t require Ph.D.s in statistics. Sometimes, it just takes a little creative thinking to move one’s career in a new and different direction.
In a recent Forbes post, J. Maureen Henderson, head of a market research firm, discussed the ways emerging Big Data expertise is being leveraged for business problems. One University of Tennessee student, for example, is pursuing a post-graduate degree in Big Data analytics to “tell stories from data,” noting that in a previous job, she saw that “there were plenty of talented ‘data’ people and plenty of the talented ‘business’ people; however, the people who could do both were extraordinarily valuable to the firm and to my team’s ability to solve problems. That really got my wheels turning, and I started thinking about what other problems I might be able to solve if I knew more about analytics.”
In the process of exploring the avenues by which big data will deliver value to businesses, some interesting new job titles and descriptions are emerging across the industry. The new generation of jobs being spurred by Big Data is often a blend of stats-savvy and business-savvy skillsets and activities.
Here is a sampling of a few of these blended positions that have recently appeared at online recruiting sites:
Industry analytics manager (pharmaceutical): “Collaborate with cross-functional partners in industry analytics, market analysis and strategy, the managed care contracting organization and brand marketing teams to consult with and deliver deep insights and actionable strategic and tactical recommendations on access and reimbursement drivers of the business. Demonstrate ability to break complex problems down into distinct parts, simplify complexity, and manage uncertainty.”
Data scientist/machine learning expert (online consumer site): “Data science team is looking for a data scientist to work on machine learning, data mining and information retrieval problems. Perform complex analysis on very large data sets using data mining, machine learning and graph analysis algorithms. Build complex predictive models that scale to petabytes of data. Define metrics, understand A/B testing and statistical measurement of model quality. Work closely with product, engineering, and marketing teams to identify, collect and analyze data to answer critical product questions.”
Data anthropologist (marketing firm): “Leverage huge data set and powerful analytical tools to give the public more insight into the digital world. Keep up-to-the-minute with current events and understand the online ecosystem—publishers, consumers, advertisers, analysts, bloggers, vendors, journalists and others who make the internet buzz. Craft stories to create buzz, backed by our data, and share them as tweets or blog posts or press releases or white papers or whatever best suits the material.”
Data scientist/data lover (scholarship fund): “Define and implement the social media measurement strategies and business intelligence analytics that align with marketing and business objectives; perform qualitative, statistical and quantitative analysis; producing meaningful marketing KPI dashboards and delivering routine and ad hoc, cross-channel performance reports with actionable insight. The candidate should be able to identify correlations in cause and effect of email and online/MR and social media integrated campaigns resulting in increased individual donations and stakeholder engagement.”
Systems engineer – big data (game publisher): “Players continue to rack up billions of hours of play — all of it logged, all of the logs frankly rather useless until our lab-coated Big Data scientists work their black magics, transmogrifying unwieldy petabytes through the careful application of open-source and proprietary technologies and bucketloads of intellectual elbow grease. As Systems Engineer – Big Data you’ll provide ongoing support for data warehouse and data services infrastructure and systems, ensuring Big Data van keeps rolling, come hell, high water or technical difficulties. Your exceptional communication skills will help you as you smooth the transition from raw data to actionable insight about the players.”
Data visualization engineer, streaming platforms (streaming video provider): “Own and build new, high-impact visualizations in our insight tools to make data both understandable and actionable. Develop rich interactive graphics, data visualizations of large amount of structured data, preferably in-browser. To deliver operational insights quickly and effectively, we need an excellent suite of interactive tools and dynamic dashboards. Our goal is to raise our operational insight capabilities to a whole new level of excellence that enables us to continuously improve our product while ensuring a flawless experience for our customers.”
In June, I was invited to present and participate in a panel discussion at a special program on Big Data at Stevens Institute of Technology in Hoboken, New Jersey.
But my role wasn’t to join the other speakers and help pay homage to the power and potential of Big Data. Rather, I was asked by the organizer, professor Lem Tarshis, to play “Devil’s Advocate,” and talk about the issues and challenges Big Data brings up.
Indeed, there has been some pushback taking place against Big Data, alleging that its potential for knowledge advancement is being over-promised, that its legal implications are not well understood, and that it may be outright dangerous for business leaders to base decisions on erroneous assumptions.
I began my talk with a little bit of history – close to 30 years ago, to be exact:
On September 26, 1983, the United States was rebuilding its nuclear arsenal, the Soviet Union was still the Evil Empire, and there was no trust between the two superpowers. In fact, the leaders of the Soviet Union were almost paranoid that the U.S. was planning a surprise attack against them. NATO was conducting war exercises at the time. Everyone was on hair-trigger alert. On the night of September 26th, the officer in charge of the Soviet Air Defense Forces was ill, so another officer, Stanislav Yevgrafovich Petrov, filled in.
Not long after the shift started, the center received a warning from one of its satellites that an ICBM launch had just taken place from the United States. All the systems were flashing red. Petrov looked at it and decided: It’s just one missile. If they were attacking, they wouldn’t just launch a single missile. So he overrode the attack warning. But then the center was alerted that a second missile had been launched from the midwestern U.S. Still, Petrov was undaunted. Then, there were alarms for a third launch. Then a fourth launch. Then a fifth launch.
I imagine many Soviet apparatchiks would have reflexively hit that red launch button at that point. But Petrov kept his cool. He had no information confirming whether the US launch reports were real or erroneous. He only had his gut at that moment. But something in his gut told him that this wasn’t the real thing. And he chose not to put through an order for a massive Soviet missile retaliation.
It turns out Petrov’s gut instinct was correct, of course. The stationary Soviet satellite above the continental U.S. was actually picking up glints of sunlight that were coming over the horizon, and mistaking them for missile launches. The data that was streaming into the Soviet command center was erroneous data.
But that was 1983, a long time ago, with old Soviet technology, right? Our systems and data feeds are all perfect and flawless now, right?
Well, technology is more advanced, and yes, misreading Big Data doesn’t have to mean the end of the world. But perhaps every organization could use a Stanislav Petrov on staff. Someone who thinks critically, who can question the results the data is providing and put it into context.
Consider how, just a couple of months ago, someone hijacked the AP Twitter account with a false report of an attack on the White House. Sensing the immediate swoon in stocks, the high-frequency trading algorithms kicked into high gear and sent major US stock indexes into a nosedive, all in three minutes’ time.
A recent survey of 300 financial executives released by Experian finds that most executives feel they lack enough accurate information to successfully perform daily operations or make decisions. The main challenges identified by respondents are outdated information, linking different sources of information and inaccurate data. On average, companies thought that 25 percent of their data was inaccurate. Only 13 percent of companies thought the problems with their data were small enough that they did not require further investment.
In big data scenarios, you have managers not trained in statistics making bet-the-business decisions based on data of unknown quality originating from unvetted sources. Data analysts and scientists can write the algorithms that extract the data, but they aren’t necessarily in a position to understand the business implications.
That’s why, even though Big Data analytics is providing a lot of new types of information organizations can act on, business leaders and managers still need to understand the sources of this data, and how systems are delivering the information they will bet the business on. What is the source of the information? Are there other potential sources that will help build a conclusion? And, very importantly: What is the context of this data?
To be successful at Big Data, it’s incumbent upon organizations to encourage critical thinking among business users of the data.
No matter how well integrated and powerful your back-end resources may be for managing Big Data, it’s all for naught if information can’t be effectively delivered and presented over that last 100 feet to decision-makers. It’s kind of like having a sophisticated power grid supporting the generation and transmission of electricity, but the consumer at home can’t figure out where the switch is to turn on the lamp.
That’s where data visualization can make all the difference. Yes, graphical displays of data have been around for more than a couple of decades now. I remember back in the earliest days of the PC revolution using a package called Harvard Graphics, which did a nice job of converting rows and columns of data into nice, snazzy bar charts or pie charts. The spreadsheet makers recognized the power of visual representations of data, and also incorporated graphical capabilities into their products.
Now, there is an emerging class of front-end visualization tools that convert data points into visual displays – often stunning – that enable users to spot anomalies or trends in seconds. These are often referred to as 3D visualizations, but there is a fourth dimension involved as well – time. Interfaces can be moved back in time – or forward if predictive analytics is available – to show how selected scenarios will change within a specified timeline.
If you want an illustration of what visualization can look like for enterprises, let’s broaden our horizons for a moment – really broaden our horizons. The Google Data Arts Team recently designed an interactive 3D map of the universe called “100,000 Stars.”
The 100,000 Stars interface enables you to zoom in on our own planet, then zoom out to the solar system, with our Sun at the center, then zoom over to the closest adjoining star and its solar system. Click on specific stars and planets, and you will get a brief description. Zoom out further, and you see we’re actually in one of the arms of the pinwheel of the Milky Way galaxy.
Imagine similar visualizations for business problems and opportunities, and you get what I mean – it’s out of this world. You can plot your data points, and even plot time to see how trends unfold. You already see this with those weather maps that move two or three days into the future. It turns the data into a physical object that you can view from different angles or timespans. It really brings data alive, and drives home any points that need to be made.
And we’re not just talking about “spacy” visualizations either. You may have seen, on some websites, the use of word “clouds,” for example. The terms getting the most usage appear in the largest fonts, so at a glance, a viewer can see what the hottest topic may be.
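As a rough illustration of how a word cloud arrives at its font sizes, here is a minimal Python sketch. It counts how often each term appears and scales the counts linearly into a range of point sizes. The stop-word list, the size range and the function name `word_cloud_sizes` are assumptions made for this example, not part of any particular visualization product.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def word_cloud_sizes(text, min_pt=10, max_pt=48, top_n=20):
    """Map the most frequent terms in `text` to font sizes in points."""
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in STOP_WORDS]
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    spread = highest - lowest
    sizes = {}
    for word, count in counts:
        # Linear scaling: the most-used term gets max_pt, the least-used min_pt.
        scale = (count - lowest) / spread if spread else 1.0
        sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


sample = ("big data analytics turns data into decisions and data "
          "into value for analytics teams")
sizes = word_cloud_sizes(sample)
for word, size in sizes.items():
    print(f"{word}: {size}pt")
```

In this sample, "data" is the most frequent term and so gets the largest size. Drawing the words on screen is left to whatever charting tool is at hand.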
In his latest book, Data Points: Visualization That Means Something, Nathan Yau makes the case for applying visualization against the toughest business and societal problems, as well as to uncover new opportunities that could not be considered previously in our flatter, 2D world. Ultimately, with data visualization, one can’t help but spot the trend or anomaly almost instantaneously:
“When you look at visualization for the first time, your eyes dart around trying to find a point of interest. Actually, when you look at anything, you tend to spot things that stand out, such as bright colors, shapes that are bigger than the rest, or people who are on the long tail of the height curve.”
In a report published in 2011, Tony DeSantis, Mathew Gentile and Rich Simon, all of Deloitte, provide a down-to-earth example of how visualization can deliver business value: spotting potentially fraudulent invoices within an enterprise accounts payable department. “A traditional detection technique would be to list the invoice or purchase order numbers on a spreadsheet and sort them to identify numbers that are repeated, occur out of sequence, or increase by unusually small amounts over time, which [suggests] that the vendor has few or no other customers,” they point out. A visual graphic, on the other hand, will quickly make such anomalies blindingly obvious.
As DeSantis and his team put it: “Visual analytics builds on humans’ natural ability to absorb a greater volume of information in visual than in numeric form, and to perceive certain patterns, shapes and shades more easily than others. Using mathematical techniques to evaluate patterns and outliers, effective visuals can translate multidimensional data such as frequency, time and relationships into an intuitive picture.”
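To make the traditional spreadsheet check described above concrete, here is a minimal Python sketch that flags the three warning signs the Deloitte authors mention: repeated numbers, numbers that arrive out of sequence, and numbers that increase by unusually small amounts. The function name `flag_invoice_numbers`, the sample data and the `small_step` threshold are all assumptions for illustration; a sensible threshold would depend on the vendor and the business.

```python
from collections import Counter


def flag_invoice_numbers(invoice_numbers, small_step=2):
    """Flag suspicious patterns in one vendor's invoice numbers.

    `invoice_numbers` is a list of ints in the order they were received.
    `small_step` is a hypothetical threshold for "unusually small" increases.
    """
    counts = Counter(invoice_numbers)
    repeated = sorted(n for n, c in counts.items() if c > 1)

    out_of_sequence = []
    small_steps = []
    for prev, curr in zip(invoice_numbers, invoice_numbers[1:]):
        if curr < prev:
            # A lower number arriving after a higher one.
            out_of_sequence.append(curr)
        elif 0 < curr - prev <= small_step:
            # Consecutive invoices that are suspiciously close together.
            small_steps.append((prev, curr))

    return {
        "repeated": repeated,
        "out_of_sequence": out_of_sequence,
        "small_steps": small_steps,
    }


received = [1001, 1002, 1003, 1003, 1250, 1100, 1101]
flags = flag_invoice_numbers(received)
print(flags)
```

A list of flags like this is still a table of numbers. The point the authors make is that plotting the same invoice numbers over time would let a reviewer see the same patterns at a glance.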
Big Data isn’t a technology or solution set that gets dropped into an organization, ready to deliver compelling insights that will put the business on an upward trajectory of intelligence and prosperity. Rather, it is a gradually building wave that an organization’s leaders will need to learn to ride, or else get swamped. Understanding and working effectively with big data will take a lot of practice.
That’s the theme of a new book co-authored by Michael Minelli, vice president of information services for MasterCard Advisors, along with Michele Chambers, formerly general manager and VP of Big Data analytics at IBM, and Ambiga Dhiraj, head of client delivery for Mu Sigma.
In the book, Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today’s Businesses, Minelli, Chambers and Dhiraj lay out the ways organizations can prepare to consume big data analytics.
1) Consider who is handling the “last mile” in data analysis: You need people who can look at the big picture with big data and explain its implications to the business. The authors quote Dr. Usama Fayyad, who talks about the crucial last mile in data analytics – the people “who are basically there to deliver the results of the analysis and put them in terms the business can understand. This last-mile group is made up of data analysts who know enough about the business to present to the CMO or the CEO.” The challenge is finding and hiring these people, which is not an easy task. Many organizations also put these people to work on tactical assignments. “That’s a mistake, because these are people who can help develop and guide strategy, move the needle, and grapple with big issues,” Fayyad is quoted as saying.
2) Introduce the power of “geospatial intelligence”: Geospatial intelligence involves the gathering and analysis of data to form more of a 3D view of what’s happening around the organization. It’s about “using data about space and time to improve the quality of predictive analysis.” Minelli and his co-authors quote IBM’s Jeff Jonas: “It’s going to come from weaving together data that has traditionally not been woven together.” This means location data generated from sensors and smartphones, as well as social media data.
3) Separate the signal from the noise: With so much data and extremely large datasets, there’s going to be a lot of noise, with a lot of conflicting signals. “As data gets larger, it becomes increasingly difficult to fully grasp the meaning and magnitude of the data through exploratory analysis,” the authors state. The best way to help analysts find the nuggets of information they need is through visualization tools. For example, a “word cloud” of relevant terms plucked from a site or journal – the more mentions, the larger the font – will show, at a glance, the topics mentioned most often.
4) Collaborate: “successful analytics is a collaborative endeavor,” Minelli and his co-authors state. “The first step in the process is to take your analytics intent beyond your core team and sell it to a wider group of decision makers – the prospective daily consumers of analytics in your organization.”
5) Learn to lead: “organizations that successfully consume analytics are driven by leadership, which builds consensus in the organization and allows for moving ahead without the need to have everyone on board every step of the way,” the authors state. “Strong leadership has been found to be the most important trigger in the wider analytics adoption in organizations.”
6) Measure, measure, measure: “Use analytics to measure itself,” Minelli and his co-authors urge. They add that hard numbers actually aren’t necessary to gauge any progress – the availability of analytics may elevate discussions and awareness of what the business needs. “One often but profound change in organizations is the maturing of a culture of objective debates, arguments and viewpoints driven by data and not just ‘gut feel,’” the authors state.
7) Change your incentives: Big data analytics implementations will shake up the flow of information across the organization, and thus rearrange the hierarchy. Such projects will “bring in new stakeholders in employees’ decisions as well as higher levels of oversight,” the authors point out. “Sometimes, a general tendency of status quo bias exists, and employees do [not] want to venture out of their comfort zone. You need to create robust incentives to overcome these barriers.”
Hosting Big Data applications in the cloud has compelling advantages. Scale doesn’t become as overwhelming an issue as it is within on-premises systems. IT will no longer feel compelled to throw more disks at burgeoning storage requirements, and performance becomes the contractual obligation of someone else outside the organization.
Cloud may help clear up some of the costlier and thornier problems of attempting to manage Big Data environments, but it also creates some new issues. As Ron Exler of Saugatuck Technology recently pointed out in a new report, cloud-based solutions “can be quickly configured to address some big data business needs, enabling outsourcing and potentially faster implementations.” However, he adds, employing the cloud also brings some risks.
Data security is one major risk area, and I could write many posts on this. But management issues also present other challenges. Too many organizations see cloud as a cure-all for their application and data management ills, but broken processes are never fixed when new technology is applied to them. There are also plenty of risks with the misappropriation of big data, and the cloud won’t make these risks go away. Exler lists some of the risks that stem from over-reliance on cloud technology, from the late delivery of business reports to the delivery of incorrect business information, resulting in decisions based on incorrect source data. Sound familiar? The gremlins that have haunted data analytics and management for years simply won’t disappear behind a cloud.
Exler makes three recommendations for moving big data into cloud environments – note that the solutions he proposes have nothing to do with technology, and everything to do with management:
1) Analyze the growth trajectory of your data and your business. Typically, organizations will have a lot of different moving parts and interfaces. And, as the business grows and changes, it will be constantly adding new data sources. As Exler notes, “processing integration or hand off points in such piecemeal approaches represent high risk to data in the chain of possession – from collection points to raw data to data edits to data combination to data warehouse to analytics engine to viewing applications on multiple platforms.” Business growth and future requirements should be analyzed and modeled to make sure cloud engagements will be able “to provide adequate system performance, availability, and scalability to account for the projected business expansion,” he states.
2) Address data quality issues as close to the source as possible. Because both cloud and big data environments have so many moving parts, “finding the source of a data problem can be a significant challenge,” Exler warns. “Finding problems upstream in the data flow prevent time-consuming and expensive reprocessing that could be needed should errors be discovered downstream.” Such quality issues have a substantial business cost as well. When data errors are found, it becomes “an expensive company-wide fire drill to correct the data,” he says.
3) Build your project management, teamwork and communication skills. Big data and cloud projects involve so many people and components from across the enterprise that they require coordination and interaction between various specialists, subject matter experts, vendors, and outsourcing partners. “This coordination is not simple,” Exler warns. “Each group involved likely has different sets of terminology, work habits, communications methods, and documentation standards. Each group also has different priorities; oftentimes such new projects are delegated to lower priority for supporting groups.” Project managers must be leaders and understand the value of open and regular communications.
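Exler’s second recommendation, catching data quality problems as close to the source as possible, can be sketched in a few lines of Python. The idea is to check each record at the point of collection and set aside anything that fails, with the reason attached, before it flows into a warehouse or analytics engine. The field names, the rules and the functions `validate_record` and `ingest` are hypothetical and only illustrate the approach; they do not come from Exler’s report.

```python
def validate_record(record, required=("customer_id", "amount", "timestamp")):
    """Return a list of problems found in one incoming record."""
    errors = []
    for field in required:
        if record.get(field) in (None, ""):
            errors.append(f"missing {field}")
    amount = record.get("amount")
    if amount not in (None, ""):
        try:
            if float(amount) < 0:
                errors.append("negative amount")
        except (TypeError, ValueError):
            errors.append("amount is not numeric")
    return errors


def ingest(records):
    """Split records into accepted ones and rejected ones with reasons."""
    accepted, rejected = [], []
    for record in records:
        errors = validate_record(record)
        if errors:
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected


batch = [
    {"customer_id": "C1", "amount": "19.99", "timestamp": "2013-05-01T10:00:00"},
    {"customer_id": "", "amount": "5.00", "timestamp": "2013-05-01T10:05:00"},
    {"customer_id": "C3", "amount": "abc", "timestamp": "2013-05-01T10:06:00"},
]
accepted, rejected = ingest(batch)
for record, errors in rejected:
    print(record["timestamp"], errors)
```

Rejected records carry their error messages with them, so the team that owns the source system can fix the problem there rather than after the data has been combined and reprocessed downstream.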