A few years back, there was a movement in some businesses to establish “data stewards” – individuals who would sit at the heart of the enterprise and make it their job to assure that data being consumed by the organization is of the highest possible quality, is secure, is contextually relevant, and is capable of interoperating across any applications that need to consume it. While the data steward concept came along when everything was relational and structured, these individuals are now earning their pay when it comes to managing the big data boom.
The rise of big data is creating more than simple headaches for data stewards; it is creating turf wars across enterprises. As pointed out in a recent article in The Wall Street Journal, there isn’t yet a lot of clarity as to who owns and cares for such data. Is it IT? Is it lines of business? Is it legal? There are arguments that can be made for all jurisdictions.
In organizations these days, for example, marketing executives are generating, storing and analyzing large volumes of their own data within content management systems and social media analysis solutions. Many marketing departments even have their own IT budgets. Along with marketing, of course, everyone else within enterprises is seeking to pursue data analytics to better run their operations as well as foresee trends.
Typically, data has been under the domain of the CIO, the person who oversaw the collection, management and storage of information. In the Wall Street Journal article, however, it’s suggested that legal departments may be the best caretakers of big data, since big data poses a “liability exposure,” and legal departments are “better positioned to understand how to use big data without violating vendor contracts and joint-venture agreements, as well as keeping trade secrets.”
However, legal being legal, it’s likely that insightful data may end up getting locked away, never to see the light of day. Others may argue that the IT department needs to retain control, but there again, IT isn’t trained to recognize information that may set the business on a new course.
Focusing on big data ownership isn’t just an academic exercise. The future of the business may depend on the ability to get on top of big data. Gartner, for one, predicts that within the next three years, at least a third of Fortune 100 organizations will experience an information crisis, “due to their inability to effectively value, govern and trust their enterprise information.”
This ability to “value, govern and trust” goes way beyond the traditional maintenance of data assets that IT has specialized in over the past few decades. As Gartner’s Andrew White put it: “Business leaders need to manage information, rather than just maintain it. When we say ‘manage,’ we mean ‘manage information for business advantage,’ as opposed to just maintaining data and its physical or virtual storage needs. In a digital economy, information is becoming the competitive asset to drive business advantage, and it is the critical connection that links the value chain of organizations.”
For starters, then, it is important that the business have full say over what data needs to be brought in, what data is important for further analysis, and what should be done with data once it gains in maturity. IT, however, needs to take a leadership role in assuring the data meets the organization’s quality standards, and that it is well-vetted so that business decision-makers can be confident in the data they are using.
The bottom line is that big data is a team effort, involving the whole enterprise. IT has a role to play, as does legal, as do the lines of business.
Which comes first: innovation or analytics?
Bain & Company released some survey findings a few months back that actually put a value on big data. Companies with advanced analytic capabilities, the consultancy finds, are twice as likely to be in the top quartile of financial performance within their industries; five times as likely to make decisions much faster than market peers; three times as likely to execute decisions as intended; and twice as likely to use data very frequently when making decisions.
This is all good stuff, and the survey, which covered the input of 400 executives, draws a direct correlation between big data analytics efforts and the business’s bottom line. However, it raises a question: How does an organization become one of these analytic leaders? And there’s a more brain-twisting question to this as well: would the type of organization supporting an advanced analytics culture be more likely to be ahead of its competitors because its management tends to be more forward-thinking on a lot of fronts, and not just big data?
You just can’t throw a big data or analytics program or solution set on top of the organization (or drop in a data scientist) and expect to be dazzled with sudden clarity and insight. If an organization is dysfunctional, with a lot of silos, fiefdoms, or calcified and uninspired management, all the big data in the world isn’t going to lift its intelligence quotient.
The authors of the Bain & Company study, Travis Pearson and Rasmus Wegener, point out that “big data isn’t just one more technology initiative” – “in fact, it isn’t a technology initiative at all; it’s a business program that requires technical savvy.”
Succeeding with big data analytics requires a change in the organization’s culture, and the way it approaches problems and opportunities. The enterprise needs to be open to innovation and change. And, as Pearson and Wegener point out, “you need to embed big data deeply into your organization. It’s the only way to ensure that information and insights are shared across business units and functions. This also guarantees the entire company recognizes the synergies and scale benefits that a well-conceived analytics capability can provide.”
Pearson and Wegener also point to the following common characteristics of big data leaders they have studied:
Pick the “right angle of entry”: There are many areas of the business that can benefit from big data analytics, but just a few key areas that will really impact the business. It’s important to focus big data efforts on the right things. Pearson and Wegener say there are four areas where analytics can be relevant: “improving existing products and services, improving internal processes, building new product or service offerings, and transforming business models.”
Communicate big data ambition: Make it clear that big data analytics is a strategy that has the full commitment of management, and it’s a key part of the organization’s strategy. Messages that need to be communicated: “We will embrace big data as a new way of doing business. We will incorporate advanced analytics and insights as key elements of all critical decisions.” And, the co-authors add, “the senior team must also answer the question: To what end? How is big data going to improve our performance as a business? What will the company focus on?”
Sell and evangelize: Selling big data is a long-term process, not just one or two announcements at staff meetings. “Organizations don’t change easily and the value of analytics may not be apparent to everyone, so senior leaders may have to make the case for big data in one venue after another,” the authors caution. Big data leaders, they observe, have learned to take advantage of the tools at their disposal: they “define clear owners and sponsors for analytics initiatives. They provide incentives for analytics-driven behavior, thereby ensuring that data is incorporated into processes for making key decisions. They create targets for operational or financial improvements. They work hard to trace the causal impact of big data on the achievement of these targets.”
Find an organizational “home” for big data analysis: A common trend seen among big data leaders is that they have created an organizational home for their advanced analytics capability, “often a Center of Excellence overseen by a chief analytics officer,” according to Pearson and Wegener. This is where matters such as strategy, collection and ownership of data across business functions come into play. Organizations also need to plan how to generate insights, and prioritize opportunities and the allocation of data analysts’ and scientists’ time.
There is a hope and perception that adopting data analytics will open up new paths to innovation. But it often takes an innovative spirit to open up analytics.
Let’s face it, big data – or data in any size, format or shape – is nothing more than just a bunch of digital bits that occupy space on a disk somewhere. To be useful to the business, end-users need to be able to access it, and pull out and assemble the nuggets of information they need. Data needs to be brought to life.
That’s the theme of a webcast I recently had the opportunity to co-present with Tableau Software, titled “Making Big Data User-Centric.” In fact, there’s a lot more to it than making data user-centric – big data should be a catalyst that fires people’s imaginations, enabling them to explore new avenues that were never opened up before.
Many organizations are beginning their journey into the new big data analytics space, and are starting to discover all the possibilities it offers. But, in an era where data is now scaling into the petabyte range, it’s more than technology. It’s a disruptive force, and with disruption comes new opportunities for growth.
Here are nine ways to make this innovative disruption possible:
1. Remember that “data” is not “information.” Too many people think that data itself is a valuable commodity. However, that is like taking oil right out of the ground and trying to sell it at gas stations – it’s not usable. It needs to be processed, refined, and packaged for delivery. It needs to be unified for eventual delivery and presentation. And, finally, to give information its value, it needs to tell a story.
2. Make data sharable across the enterprise. Big data – like all types of data – tends to naturally drift into silos within departments across enterprises. For years, people have struggled to break down these silos and provide a single view of all relevant data. Now there’s a way to do it – through a unified service layer. Think of all the enterprisey things coming to the forefront in recent years – service oriented architecture, data virtualization, search technologies. No matter how you do it, the key is to provide a way for data to be made available across enterprise walls.
3. Use analytics to push the innovation envelope. Big data analytics enables end-users to ask questions and consider options that weren’t possible within standard, relational data environments.
4. Encourage critical thinking among data users. Business users have powerful tools at their disposal, and access to data they’ve never had before. It’s more important than ever to consider where the information came from, its context, and other potential sources that are not in the enterprise’s data stream.
5. Develop analytical skills across the board. Surveys I have conducted in partnership with Unisphere Research find that barely 10% of organizations offer self-service BI on a widespread basis. This needs to change. Everybody is working with information and data, and everyone needs to understand the implications of the information and data with which they are working.
6. Promote self-service. Analytic capabilities should be delivered on a self-service basis. End-users are accustomed to information being delivered to them at Google speeds, making the processes they deal with at work – requesting reports from their IT departments, setting up queries – seem downright antiquated, as well as frustrating.
7. Make it visual. Yes, graphical displays of data have been around for more than a couple of decades now. But now, there is an emerging class of front-end visualization tools that convert data points into visual displays – often stunning – that enable users to spot anomalies or trends within seconds.
8. Make it mobile. Just about everyone now carries mobile devices from which they can access data from any place. It’s now possible to offer analytics ranging from key performance indicator marketing and drill-down navigation to data selection, data filtering, and alerts.
9. Make it social. There are two ways to look at big data analytics and social media. First, there’s the social media data itself. BI and analytics efforts would be missing a big piece of the picture if they did not address the wealth of social media data flowing through organizations. This includes sentiment analysis and other applications to monitor interactions on external social media sites, to determine reactions to new products or predict customer needs. But there’s also the collaboration aspect, the ability to share insights and discoveries with peers and partners. Either way, it takes many minds working together to effectively pull information from all that data.
In recent times, the big Internet companies – the Googles, Yahoos and eBays – have proven that it is possible to build a sustainable business on data analytics, in which corporate decisions and actions are seamlessly guided by an analytics culture, based on data, measurement and quantifiable results. Now, two of the top data analytics thinkers say we are reaching a point where non-tech, non-Internet companies are on their way to becoming analytics-driven organizations in a similar vein, as part of an emerging data economy.
In a report written for the International Institute for Analytics, Thomas Davenport and Jill Dyché divulge the results of their interviews with 20 large organizations, in which they find big data analytics to be well integrated into the decision-making cycle. “Large organizations across industries are joining the data economy,” they observe. “They are not keeping traditional analytics and big data separate, but are combining them to form a new synthesis.”
Davenport and Dyché call this new state of management “Analytics 3.0,” in which the concept and practices of competing on analytics are no longer confined to data management and IT departments or quants – analytics is embedded into all key organizational processes. That means major, transformative effects for organizations. “There is little doubt that analytics can transform organizations, and the firms that lead the 3.0 charge will seize the most value,” they write.
Analytics 3.0 is the latest of three distinct phases in the way data analytics has been applied to business decision making, Davenport and Dyché say. The first two “eras” looked like this:
- Analytics 1.0, prevalent between 1954 and 2009, was based on relatively small and structured data sets from internal corporate sources.
- Analytics 2.0, which arose between 2005 and 2012, saw the rise of the big Web companies – the Googles and Yahoos and eBays – which were leveraging big data stores and employing prescriptive analytics to target customers and shape offerings. This time span was also shaped by a growing interest in competing on analytics, in which data was applied to strategic business decision-making. “However, large companies often confined their analytical efforts to basic information domains like customer or product, that were highly-structured and rarely integrated with other data,” the authors write.
- In the Analytics 3.0 era, analytical efforts are being integrated with other data types, across enterprises.
This emerging environment “combines the best of 1.0 and 2.0—a blend of big data and traditional analytics that yields insights and offerings with speed and impact,” Davenport and Dyché say. The key trait of Analytics 3.0 “is that not only online firms, but virtually any type of firm in any industry, can participate in the data-driven economy. Banks, industrial manufacturers, health care providers, retailers—any company in any industry that is willing to exploit the possibilities—can all develop data-based offerings for customers, as well as supporting internal decisions with big data.”
Davenport and Dyché describe how one major trucking and transportation company has been able to implement low-cost sensors for its trucks, trailers and intermodal containers, which “monitor location, driving behaviors, fuel levels and whether a trailer/container is loaded or empty. The quality of the optimized decisions [the company] makes with the sensor data – dispatching of trucks and containers, for example – is improving substantially, and the company’s use of prescriptive analytics is changing job roles and relationships.”
New technologies and methods are helping enterprises enter the Analytics 3.0 realm, including “a variety of hardware/software architectures, including clustered parallel servers using Hadoop/MapReduce, in-memory analytics, and in-database processing,” the authors add. “All of these technologies are considerably faster than previous generations of technology for data management and analysis. Analyses that might have taken hours or days in the past can be done in seconds.”
Another key characteristic of big data analytics-driven enterprises is the ability to fail fast – to deliver, with great frequency, partial outputs to project stakeholders. With the rise of new ‘agile’ analytical methods and machine learning techniques, organizations are capable of delivering “insights at a much faster rate,” and of providing for “an ongoing sense of urgency.”
Perhaps most importantly, big data and analytics are integrated and embedded into corporate processes across the board. “Models in Analytics 3.0 are often being embedded into operational and decision processes, dramatically increasing their speed and impact,” Davenport and Dyché state. “Some are embedded into fully automated systems based on scoring algorithms or analytics-based rules. Some are built into consumer-oriented products and features. In any case, embedding the analytics into systems and processes not only means greater speed, but also makes it more difficult for decision-makers to avoid using analytics—usually a good thing.”
The report is available here.
With the growing prominence of big data as both a strategic and tactical resource for enterprises, there’s been a growing shift in the scope of business intelligence. Not too long ago, BI’s world was in tools that ran on individual workstations or PCs, providing filtered reports on limited sets of data, or stacking the data into analytical cubes.
Now, BI encompasses a range of data and analytics from across the enterprise, and is as likely to be online, supported in the cloud, as it is to run on a local PC. However, as it has been for years, BI adoption still tends to be limited, not reaching its full potential. BI analyst Cindi Howson asks the question: what’s holding companies back from achieving a big impact with BI? In a recent Q&A with TDWI’s Linda Briggs, she discussed the issues raised in her new book, Successful Business Intelligence: Unlock the Value of BI and Big Data.
The success of BI depends, more than anything, on one factor, she says: corporate culture. Some organizations have achieved an analytic culture that reaches across their various business lines, but for many, it’s a challenge. “Leadership means not just the CIO but also the CEO, the lines of business, the COO, and the VP of marketing,” says Howson. “Culture and leadership are closely related, and it’s hard to separate one from the other.”
While corporate culture has always been important to success, it takes on an even more critical role in efforts to compete on analytics. For example, she illustrates, “companies have a lot of data, and certainly they value the data, but there is sometimes a fear of sharing it. Once you start exposing the data, somebody’s job might be on the line, or it can show that someone made some bad decisions. Maybe the data will reveal that you’ve spent millions of dollars and you’re not really getting the returns that you thought you would in pursuing a particular market segment or product.”
It’s important to see an analytics culture as focusing on data as a tool to see problems and make course corrections, or act on opportunities – not to punish or expose individuals or departments.
Another point of corporate resistance is employing BI in the cloud, a challenge recently explored by Brad Peters, CEO of Birst. Here again, corporate culture may hold back efforts to move to the cloud, which offers greater scalability and availability for BI and analytics initiatives. In a recent interview in Diginomica, he says that IT departments, for example, may throw up roadblocks, for fear of being disintermediated. Plus, there is also a recognition that once BI data is in the cloud, it often gets “harder to work with.” Multi-tenant sites, for example, have security systems and protocols that may limit users’ ability to manipulate or parse the data.
The increasing adoption of cloud-based services – such as those from Amazon or Salesforce – is gradually melting resistance to the idea of cloud-based BI, Peters adds. He particularly sees advantages for geographically-dispersed workforces.
For his part, he admits that he “has never been under any illusion that the shift of enterprise analytics to the cloud was going to happen overnight.”
Big Data means many things to many people – it all depends on their place and perspective in the organization. But there is something for everyone.
I explored the advantages being seen across the enterprise in a recent special report in Database Trends & Applications, and have distilled the key points below:
For data managers, it’s all about choice. The rise of the Big Data environment has brought with it a new generation of solutions, including open source, NoSQL and NewSQL databases – not to mention Apache Hadoop and cloud-based data environments. Big Data is extremely accessible now because of low-cost solutions for capturing and analyzing unstructured forms of data that weren’t available until recently. Consider all the sensor data – from RFID tags, from machines – that’s been floating around for the past decade. Previously, capturing and managing such data was never cheap. With more inexpensive databases and tools such as Hadoop, such data is now within the reach of even the smallest organizations. In addition, cloud provides almost unlimited capacity, and can support and provide big data analytics at a scale that would otherwise be cost-prohibitive for most organizations.
For data scientists, analysts and quants in organizations, it’s all about capabilities. The new Big Data world is all about diving deep into datasets and being able to engage in storytelling as a way to make data come alive for the business. Open source plays a key role, through frameworks such as Hadoop and MapReduce. There is also the highly versatile R language, which is well-suited for building analytics against large and highly diverse data sets. Predictive analytics is another key capability made real by Big Data.
For business users across the enterprise, it’s all about collaboration. There has been a growing movement to open up analytics across the organization – pushing Big Data analysis down to all levels of decision-makers, from front-line customer service representatives to information workers. New capabilities such as cloud services, visualization and self-service enable end users without statistical training to build their own queries and arrive at their own conclusions. Along with user-friendly interfaces to Big Data, there’s been a rise in pervasive BI and analytics running in the background, embedded within applications or devices, in which the end-user is oblivious to the software and data sources feeding the applications. Cloud opens up business intelligence and analytics to more users. In addition, more organizations are focusing on providing Big Data analytics through apps on mobile devices, accelerating the move toward simplified access.
For the members of the executive suite, it’s all about competitiveness. Most executives grasp the power that big data can bring to their operations, especially with performance analytics, predictive analytics and customer analytics. Employing these analytics against Big Data means better understanding customers and markets, as well as becoming aware of trends that are still bubbling beneath the surface.
The greatest challenge to big data management and analysis isn’t necessarily the technical underpinnings, but rather, lingering executive confusion and uncertainty about what it is and what it can do for their organizations.
The main issue – and root of executive befuddlement – is the abject and ongoing confusion about what, exactly, is meant by “big data.” It’s certainly a hyped-up term for something that has been around for a long time. If you had a one-terabyte database around the turn of the century, you had big data, that’s for sure. If you had a 500-megabyte database in 1990, that would have been big data.
So the “volume” has always been there, and has always been a relative measure. The same goes for the “variety” aspect of big data. Unstructured data – such as Word documents or machine log data – has been floating around organizations for decades now. How about the “velocity”? Real-time processing has been on corporate radar screens for well over a decade.
So, what’s changed that we suddenly see this data as an enabler, a game-changer, opening up the gates to a brave new world of analytics-driven purpose? The rise of relatively cheap open-source tools and platforms, for one. Capturing and analyzing large volumes of fast-moving data of various structures once required very expensive equipment and consulting assistance. The expensive consultants may still be needed, but the technology is within reach for many organizations.
With this in mind, it is interesting to see that business leaders are warming up to the possibilities of big data, as a new industry survey shows. But what it is exactly they think they’re warming up to is still a big question mark. The survey, conducted among 500 business and IT executives by CompTIA, shows the big data phenomenon has caught the eyes of executives. The vast majority of organizations, 78%, say they feel more positive about big data as a business initiative this year compared to a similar survey conducted a year ago. And, remarkably, 57% feel they’ve made progress in moving in the right direction with data-driven programs, compared with 37% the year before.
To their credit, the CompTIA study’s authors question how accurately these findings actually translate to progress on the big data front. They note that while this year’s survey finds 42% of respondents claiming to be engaged in some sort of big data initiative – more than double the figure from a year ago (19%) – such initiatives may be “big data” in name only. “This may stem from confusion or reflect the possibility of different users interpreting the concept of big data in different ways,” they observe.
So what we have is a lot of organizations diving into what they see as “big data” projects because that’s what everybody tells them they should be doing. But how much of this is simply the same types of data management and analytics projects that may have been undertaken five or 10 years ago?
To really be making the most of big data as we understand it today, organizations should be addressing the following questions:
How much unstructured data is coursing through the organization, and how much of it is worth harvesting? It’s usually easy to measure the amount of structured data, such as that stored in relational databases or data warehouses, but unstructured data is a huge question mark. In many cases, management is clueless about what types of unstructured assets (user-generated files, machine-generated data) are actually available. It’s going to take a lot of research and discovery to uncover the unstructured data assets that are truly meaningful for the business.
Does the current data architecture support the introduction and integration of new data sources? Most traditional data architectures are fairly rigid, built to support the inputs and outputs of relational data. Efforts involving other forms of data tend to be one-off projects, in which connectors or interfaces are hand-built for a single purpose and then forgotten. Reaching out and exploring new and varied types of data requires an architecture in which new sources can be rapidly and seamlessly introduced, without the usual silos.
Is the organization moving to an analytics culture? Big data analytics will never be “big” if it only is available to a few select decision makers or analysts. Big data will pack its punch when it enables decision makers at all levels of the organization – from customer care centers to production floors to the executive suite – to access analytics from various data sources. Even more helpful would be a way in which decision-makers can access analytical tools and back-end data sources through self-service approaches.
“Raw data is both an oxymoron and a bad idea. On the contrary, data should be cooked with care.” This was a statement made by Geoff Bowker in 2005, and served as the opening lines of a recent talk by Kate Crawford, principal researcher at Microsoft Research New England and a visiting MIT professor, who urged that big data be adopted and handled cautiously.
In her keynote at the recent DataEDGE 2013 conference, held at the University of California at Berkeley, Crawford said the time is now to have a discussion on the implications big data is having on business and society.
She outlined the six myths that have arisen around big data:
Myth #1: Big data is new. References to big data began to pop up in the literature in the late 1990s, but this is something some prominent industries, such as financial services firms and oil companies, have been wrestling with for decades, Crawford says. What is new, however, “is the fact that a lot of the tools of big data are becoming more easily reached by a lot more people. We’re having an explosion in ideas, creativity and imagination in terms of what we can do with these technologies.” This is the time to discuss the implications of big data, she adds, because much of it will be invisible within a few years as the tools and technologies mature. “Really usable systems and really good technologies disappear,” she states. “The easier they are to use, the harder they are to see.”
Myth #2: Big data is objective. Actually, big data sets can be very biased, Crawford states. For example, she says, she pored through 20 million tweets sent out about Hurricane Sandy, which flooded her neighborhood in Manhattan last year. While the tweets tell a compelling story about how residents coped, they mainly represent the views of younger, more well-to-do Manhattan residents. “If we look a little closer at the tweets, most were coming out of Manhattan, which has a higher concentration of people using smartphones, and a higher concentration of Twitter users – a subset of a subset. There were very few tweets coming from the far more affected areas, such as Breezy Point or the Far Rockaways. Because we don’t have the data from those places, we essentially have very privileged urban stories. We have to be really clear who we’re talking about, we have to think about what this data really represents,” she says.
Myth #3: Big data doesn’t discriminate. “There’s a myth that says essentially because you’re dealing with large data sets, you can somehow avoid group-level prejudice,” Crawford cautions. She pointed to a recent study of the Facebook “likes” of 60,000 people that found such data can be used to identify a person’s race, sexual orientation, religious views, political leanings, and even whether they have previously used drugs or alcohol. “The researchers also expressed a set of concerns that this data can be bought by anyone. Ultimately, employers can make decisions about individuals based on this data.”
Myth #4: Big data makes cities smarter. While big data goes a long way to improve the management of city problems, it also may under-represent communities. “Not all data is created or collected equally – there are always certain communities of people who are going to be left out of those data sets,” Crawford says. For example, last year, the city of Boston released an app called StreetBump, which automatically registered potholes by passively collecting GPS data from drivers’ smartphones. The program collected a great deal of data on potholes. However, she adds, “wealthier younger citizens are more likely to have smartphones, and therefore, wealthy areas with younger people would get more attention, while areas with older residents with less money will get fewer resources.”
Myth #5: Big data is anonymous. Crawford cited a recent study, published in Nature, which determined that individuals could be identified with no more than four data points, including their cell phone number. Before the advent of personal technology, it took about 12 data points to identify an individual. “It’s very difficult to make data anonymous – even with two randomized data points, it’s possible to identify 50 percent of people.” Another big data initiative, the smart grid being adopted by electric utilities, will capture a wealth of data – from energy usage to “when you have friends over, when you are sleeping. This is some very intimate data.”
Myth #6: You can opt out of big data. There are suggestions that people will be able to protect their privacy if they pay a fee for web services to opt out of tracking, versus using services for free in exchange for giving up some information. Crawford cautions that this will result in a two-tier system, which “turns private data into a luxury good rather than a public good.”
Rather than making data privacy and management an individual choice, Crawford urges a more public discussion on “the way that the data is essentially flowing between corporations, individuals and governments.”
The drive to achieve competitive advantage with Big Data is creating a lot of interesting opportunities for managers and professionals working in the data analytics space. In some cases, new job categories not imaginable just a few years back are being created, and are in demand. “Data scientist” is only one of these descriptions, and there are jobs that don’t require Ph.D.s in statistics. Sometimes, it just takes a little creative thinking to move one’s career in a new and different direction.
In a recent Forbes post, J. Maureen Henderson, head of a market research firm, discussed the ways emerging Big Data expertise is being leveraged for business problems. One University of Tennessee student, for example, is pursuing a post-graduate degree in Big Data analytics to “tell stories from data,” noting that in a previous job, she saw that “there were plenty of talented ‘data’ people and plenty of the talented ‘business’ people; however, the people who could do both were extraordinarily valuable to the firm and to my team’s ability to solve problems. That really got my wheels turning, and I started thinking about what other problems I might be able to solve if I knew more about analytics.”
In the process of exploring the avenues by which big data will deliver value to businesses, some interesting new job titles and descriptions are emerging across the industry. The new generation of jobs being spurred by Big Data is often a blend of stats-savvy and business-savvy skillsets and activities.
Here is a sampling of a few of these blended positions that have recently appeared at online recruiting sites:
Industry analytics manager (pharmaceutical): “Collaborate with cross-functional partners in industry analytics, market analysis and strategy, the managed care contracting organization and brand marketing teams to consult with and deliver deep insights and actionable strategic and tactical recommendations on access and reimbursement drivers of the business. Demonstrate ability to break complex problems down into distinct parts, simplify complexity, and manage uncertainty.”
Data scientist/machine learning expert (online consumer site): “Data science team is looking for a data scientist to work on machine learning, data mining and information retrieval problems. Perform complex analysis on very large data sets using data mining, machine learning and graph analysis algorithms. Build complex predictive models that scale to petabytes of data. Define metrics, understand A/B testing and statistical measurement of model quality. Work closely with product, engineering, and marketing teams to identify, collect and analyze data to answer critical product questions.”
Data anthropologist (marketing firm): “Leverage huge data set and powerful analytical tools to give the public more insight into the digital world. Keep up-to-the-minute with current events and understand the online ecosystem—publishers, consumers, advertisers, analysts, bloggers, vendors, journalists and others who make the internet buzz. Craft stories to create buzz, backed by our data, and share them as tweets or blog posts or press releases or white papers or whatever best suits the material.”
Data scientist/data lover (scholarship fund): “Define and implement the social media measurement strategies and business intelligence analytics that align with marketing and business objectives; perform qualitative, statistical and quantitative analysis; producing meaningful marketing KPI dashboards and delivering routine and ad hoc, cross-channel performance reports with actionable insight. The candidate should be able to identify correlations in cause and effect of email and online/MR and social media integrated campaigns resulting in increased individual donations and stakeholder engagement.”
Systems engineer – big data (game publisher): “Players continue to rack up billions of hours of play — all of it logged, all of the logs frankly rather useless until our lab-coated Big Data scientists work their black magics, transmogrifying unwieldy petabytes through the careful application of open-source and proprietary technologies and bucketloads of intellectual elbow grease. As Systems Engineer – Big Data you’ll provide ongoing support for data warehouse and data services infrastructure and systems, ensuring Big Data van keeps rolling, come hell, high water or technical difficulties. Your exceptional communication skills will help you as you smooth the transition from raw data to actionable insight about the players.”
Data visualization engineer, streaming platforms (streaming video provider): “Own and build new, high-impact visualizations in our insight tools to make data both understandable and actionable. Develop rich interactive graphics, data visualizations of large amount of structured data, preferably in-browser. To deliver operational insights quickly and effectively, we need an excellent suite of interactive tools and dynamic dashboards. Our goal is to raise our operational insight capabilities to a whole new level of excellence that enables us to continuously improve our product while ensuring a flawless experience for our customers.”
In June, I was invited to present and participate in a panel discussion at a special program on Big Data at Stevens Institute of Technology in Hoboken, New Jersey.
But my role wasn’t to join the other speakers and help pay homage to the power and potential of Big Data. Rather, I was asked by the organizer, professor Lem Tarshis, to play “Devil’s Advocate,” and talk about the issues and challenges Big Data brings up.
Indeed, there has been some pushback taking place against Big Data, alleging that its potential for knowledge advancement is being over-promised, that its legal implications are not well understood, and that it may be outright dangerous for business leaders to base decisions on erroneous assumptions.
I began my talk with a little bit of history – close to 30 years ago, to be exact:
On September 26, 1983, the United States was rebuilding its nuclear arsenal, the Soviet Union was still the Evil Empire, and there was no trust between the two superpowers. In fact, the leaders of the Soviet Union were almost paranoid that the U.S. was planning a surprise attack against them. NATO was conducting war exercises at the time. Everyone was on hair-trigger alert. On the night of September 26th, the officer in charge of the Soviet Air Defense Forces was ill, so another officer, Stanislav Yevgrafovich Petrov, filled in.
Not long after the shift started, the center received a warning from one of its satellites that an ICBM launch had just taken place from the United States. All the systems were flashing red. Petrov looked at it and decided: It’s just one missile. If they were attacking, they wouldn’t just launch a single missile. So he overrode the attack warning. But then the center was alerted that a second missile had been launched from the midwestern U.S. Still, Petrov was undaunted. Then, there were alarms for a third launch. Then a fourth launch. Then a fifth launch.
I imagine many Soviet apparatchiks would have reflexively hit that red launch button at that point. But Petrov kept his cool. He had no information confirming whether the US launch reports were real or erroneous. He only had his gut at that moment. But something in his gut told him that this wasn’t the real thing. And he chose not to put through an order for a massive Soviet missile retaliation.
It turns out Petrov’s gut instinct was correct, of course. The stationary Soviet satellite above the continental U.S. was actually picking up glints of sunlight that were coming over the horizon, and mistaking them for missile launches. The data that was streaming into the Soviet command center was erroneous data.
But that was 1983, a long time ago, with old Soviet technology, right? Our systems and data feeds are all perfect and flawless now, right?
Well, technology is more advanced, and yes, misreading Big Data doesn’t have to mean the end of the world. But perhaps every organization could use a Stanislav Petrov on staff. Someone who thinks critically, who can question the results the data is providing and put it into context.
Consider how, just a couple of months ago, someone hijacked the AP Twitter account with a false report of an attack on the White House. Sensing the immediate swoon in stocks, the high-frequency trading algorithms kicked into high gear and sent major US stock indexes into a nosedive, all in three minutes’ time.
A recent survey of 300 financial executives released by Experian finds that most executives feel they lack enough accurate information to successfully perform daily operations or make decisions. The main challenges identified by respondents are outdated information, linking different sources of information and inaccurate data. On average, companies thought that 25 percent of their data was inaccurate. Only 13 percent of companies thought the problems with their data were small enough that they did not require further investment.
In big data scenarios, you have managers not trained in statistics making bet-the-business decisions based on data of unknown quality originating from unvetted sources. Data analysts and scientists can write the algorithms that extract the data, but they aren’t necessarily in a position to understand the business implications.
That’s why, even though Big Data analytics is providing a lot of new types of information organizations can act on, business leaders and managers still need to understand the sources of this data, and how systems are delivering the information they will bet the business on. What is the source of the information? Are there other potential sources that will help build a conclusion? And, very importantly: What is the context of this data?
To be successful at Big Data, it’s incumbent upon organizations to encourage critical thinking among business users of the data.