“What really matters about big data is what it does. Aside from how we define big data as a technological phenomenon, the wide variety of potential uses for big data analytics raises crucial questions about whether our legal, ethical, and social norms are sufficient to protect privacy and other values in a big data world.”
These crucial questions, raised in a recent White House report on the implications of big data, frame a growing debate across society and the business world about how far organizations can push the limits of data collection and analysis. The report, issued by a presidential commission tasked with assessing big data’s privacy implications, explains how big data is a double-edged sword. While big data analytics paves the way to unexpected discoveries, innovations, and advancements in our quality of life, it also has the potential for abuse. As the report puts it, big data’s capabilities, “most of which are not visible or available to the average consumer, also create an asymmetry of power between those who hold the data and those who intentionally or inadvertently supply it.”
The report’s authors acknowledge that big data analytics is an engine of economic growth and a competitive tool for companies across all industries, as well as a means of improving quality of life. “Used well, big data analysis can boost economic productivity, drive improved consumer and government services, thwart terrorists, and save lives,” the report states. In addition, data analytics will likely have a profound impact as it is applied to the Internet of Things, whose technologies, according to the report, “have made it possible to merge the industrial and information economies.” In another example, healthcare providers and payers can employ predictive analytics to detect fraud and abuse in real time.
The report’s main thrust is the implications for personal privacy, and many of these issues will inevitably shape the practices and policies of enterprises as they expand their businesses into the big data realm. The managers and professionals charged with identifying, collecting, and analyzing information assets will increasingly be under pressure, as will their organizations, to understand the boundaries between insight, targeted engagement, and overreach.
For example, a still relatively unexplored area of big data is its ownership. Does data belong to those who collect it, or those who contribute to it? “Big data may be viewed as property, as a public resource, or as an expression of individual identity,” the report states.
Another challenge is that many organizations will opt to assemble massive databases as they move forward with big data analysis. “Big data technologies can derive value from large data sets in ways that were previously impossible – indeed, big data can generate insights that researchers didn’t even think to seek,” the report states. For example, new tools and technologies allow analysis across entire data sets, rather than extracting a small representative sample and extrapolating the results to a larger population. However, the more data there is to search, the more likely it is that an analysis turns up patterns that are misleading. “Correlation still doesn’t equal causation,” the report’s authors state. “Finding a correlation with big data techniques may not be an appropriate basis for predicting outcomes or behavior, or rendering judgments on individuals. In big data, as with all data, interpretation is always important.”
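To make the correlation warning concrete, here is a minimal sketch in Python. It is not drawn from the report; the function names, the row and column counts, and the use of random numbers are all assumptions chosen for illustration. It generates a random "target" and many unrelated random columns, then reports the strongest correlation found purely by chance.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_rows=20, n_columns=1000, seed=0):
    """Return the largest-magnitude correlation between a random target
    and n_columns independent random columns (hypothetical setup:
    every value is noise, so no column has any real relationship
    to the target)."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_columns):
        column = [rng.gauss(0, 1) for _ in range(n_rows)]
        r = pearson(column, target)
        if abs(r) > abs(best):
            best = r
    return best


if __name__ == "__main__":
    r = strongest_chance_correlation()
    print(f"Strongest correlation found among pure noise: {r:.2f}")
```

With only a few rows and many columns to search, the best match among pure noise tends to look like a meaningful relationship, which is why a correlation found by scanning a large data set needs interpretation before it is used to judge individuals.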
Another issue is the permanence of data, a privacy concern that may also create headaches for corporate data managers. “In the past, retaining physical control over one’s personal information was often sufficient to ensure privacy,” the report states. “Documents could be destroyed, conversations forgotten, and records expunged. But in the digital world, information can be captured, copied, shared, and transferred at high fidelity and retained indefinitely. Volumes of data that were once unthinkably expensive to preserve are now easy and affordable to store on a chip the size of a grain of rice. As a consequence, data, once created, is in many cases effectively permanent. Furthermore, digital data often concerns multiple people, making personal control impractical.”
The report’s authors state that organizations need to take steps to address privacy issues, and they suggest de-identification and encryption as technical solutions available today. In the long run, however, de-identification remains a weak approach to the problem, the report cautions. “Many technologists are of the view that de-identification of data as a means of protecting individual privacy is, at best, a limited proposition. In practice, data collected and de-identified is protected in this form by companies’ commitments to not re-identify the data and by security measures put in place to ensure those protections.”
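The limits of de-identification can be illustrated with a short Python sketch. This is an assumption-laden example rather than a method described in the report: the field names, the made-up records, and the choice of a keyed hash (HMAC-SHA256 from the Python standard library) are all hypothetical. It replaces a direct identifier with a token, then checks how many records are still unique on the fields left behind.

```python
import hashlib
import hmac
from collections import Counter


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash. Anyone holding the key
    can recompute the token for a known name, so protection depends on
    keeping the key secret."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(records, secret_key, id_field="name"):
    """Return copies of the records with the direct identifier tokenized."""
    cleaned = []
    for record in records:
        copy = dict(record)
        copy[id_field] = pseudonymize(record[id_field], secret_key)
        cleaned.append(copy)
    return cleaned


def unique_on(records, quasi_identifiers):
    """Return records whose combination of quasi-identifier values
    appears exactly once, and so may still single out one person."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return [r for r, k in zip(records, keys) if counts[k] == 1]


# Made-up records for illustration only.
RECORDS = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1980},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1980},
    {"name": "Carol Example", "zip": "94105", "birth_year": 1975},
]

if __name__ == "__main__":
    key = b"hypothetical-secret-key"
    deidentified = deidentify(RECORDS, key)
    exposed = unique_on(deidentified, ["zip", "birth_year"])
    print(f"{len(exposed)} of {len(deidentified)} records are unique on zip and birth year")
```

Even with names tokenized, a record that is unique on its remaining fields could be matched against another data set, which is consistent with the report's point that de-identified data ultimately relies on commitments not to re-identify it and on the security measures around it.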
Ultimately, the ethical use of data depends on inspired and forward-thinking management. It takes sound judgment, a commitment to training and education, and a focus on the information that matters most to the business. Big data opens up many new vistas for enterprises, and those that take the high road will reap its rewards.