Tag Archives: public sector
In the other, they hear administrative talk of smaller budgets and scarcer resources.
As stringent requirements for both transparency and accountability grow, this paradox of pressure increases.
Sometimes, the best way to cope is to TALK to somebody.
What if you could ask other data technologists candid questions like:
- Do you think government regulation helps or hurts the sharing of data?
- Do you think government regulators balance the privacy needs of the public with commercial needs?
- What are the implications of big data government regulation, especially for users?
- How can businesses expedite the government adoption of the cloud?
- How can businesses aid in the government overcoming the security risks associated with the cloud?
- How should the policy frameworks for handling big data differ between the government and the private sector?
What if you could tell someone who understood? What if they had sweet suggestions, terrific tips, stellar strategies for success? We think you can. We think they will.
That’s why Twitter needs a #DataChat.
What on earth is a #DataChat?
Good question. It’s a Twitter Chat – A public dialog, at a set time, on a set topic. It’s something like a crowd-sourced discussion. Any Twitter user can participate simply by including the applicable hashtag in each tweet. Our hashtag is #DataChat. We’ll connect on Twitter, on the third Thursday of each month to share struggles, victories and advice about data governance. We’re going to begin this week, Thursday April 17, at 3:00 PM Eastern Time. For our first chat, we are going to discuss topics that relate to data technologies in government organizations.
What don’t you join us? Tell us about it. Mark your calendar. Bring a friend.
Because, sometimes, you just need someone to talk to.
Loraine Lawson does an outstanding job of covering the issues around government use of “data heavy” projects. This includes a report by the government IT site, MeriTalk.
“The report identifies five factors, which it calls the Big Five of IT, that will significantly affect the flow of data into and out of organizations: Big data, data center consolidation, mobility, security and cloud computing.”
MeriTalk surveyed 201 state and local government IT professionals, and found that, while the majority of organizations plan to deploy the Big Five, 94 percent of IT pros say their agency is not fully prepared. “In fact, if Big Data, mobile, cloud, security and data center consolidation all took place today, 89 percent say they’d need additional network capacity to maintain service levels. Sixty-three percent said they’d face network bottleneck risks, according to the report.”
This report states what most who work with the government already know; the government is not ready for the influx of data. Nor is the government ready for the different uses of data, and thus there is a large amount of risk as the amount of data under management within the government explodes.
Add issues with the approaches and technologies leveraged for data integration to the list. As cloud computing and mobile computing continue to rise in popularity, there is not a clear strategy and technology for syncing data in the cloud, or on mobile devices, with data that exists within government agencies. Consolidation won’t be possible without a sound data integration strategy, nor will the proper use of big data technology.
The government sees a huge wave of data heading for it, as well as opportunities with new technology such as big data, cloud, and mobile. However, there doesn’t seem to be an overall plan to surf this wave. According to the report, if they do wade into the big data wave, they are likely to face much larger risks.
The answer to this problem is really rather simple. As the government moves to take advantage of the rising tide of data, as well as new technologies, they need to be funded to get the infrastructure and the technology they need to be successful. The use of data integration approaches and technologies, for example, will return the investment ten-fold, if properly introduced into the government problem domains. This includes integration with big data systems, mobile devices, and, of course, the rising use of cloud-based platforms.
While data integration is not a magic bullet for the government, nor any other organization, the proper and planned use of this technology goes a long way toward reducing the inherent risks that the report identified. Lacking that plan, I don’t think the government will get very far, very fast.
In the first two issues I spent time looking at the need for states to pay attention to the digital health and safety of their citizens, followed by the oft forgotten need to understand and protect the non-production data. This is data than has often proliferated and also ignored or forgotten about.
In many ways, non-production data is simpler to protect. Development and test systems can usually work effectively with realistic but not real PII data and realistic but not real volumes of data. On the other hand, production systems need the real production data complete with the wealth of information that enables individuals to be identified – and therefore presents a huge risk. If and when that data is compromised either deliberately or accidentally the consequences can be enormous; in the impact on the individual citizens and also the cost of remediation on the state. Many will remember the massive South Carolina data breach of late 2012 when over the course of 2 days a 74 GB database was downloaded and stolen, around 3.8 million payers and 1.9 million dependents had their social security information stolen and 3.3 million “lost” bank account details. The citizens’ pain didn’t end there, as the company South Carolina picked to help its citizens seems to have tried to exploit the situation.
The biggest problem with securing production data is that there are numerous legitimate users and uses of that data, and most often just a small number of potentially malicious or accidental attempts of inappropriate or dangerous access. So the question is… how does a state agency protect its citizens’ sensitive data while at the same time ensuring that legitimate uses and users continues – without performance impacts or any disruption of access? Obviously each state needs to make its own determination as to what approach works best for them.
This video does a good job at explaining the scope of the overall data privacy/security problems and also reviews a number of successful approaches to protecting sensitive data in both production and non-production environments. What you’ll find is that database encryption is just the start and is fine if the database is “stolen” (unless of course the key is stolen along with the data! Encryption locks the data away in the same way that a safe protects physical assets – but the same problem exists. If the key is stolen with the safe then all bets are off. Legitimate users are usually easily able deliberately breach and steal the sensitive contents, and it’s these latter occasions we need to understand and protect against. Given that the majority of data breaches are “inside jobs” we need to ensure that authorized users (end-users, DBAs, system administrators and so on) that have legitimate access only have access to the data they absolutely need, no more and no less.
So we have reached the end of the first series. In the first blog we looked at the need for states to place the same emphasis on the digital health and welfare of their citizens as they do on their physical and mental health. In the second we looked at the oft-forgotten area of non-production (development, testing, QA etc.) data. In this third and final piece we looked at the need to and some options for providing the complete protection of non-production data.
In my first article on the topic of citizens’ digital health and safety we looked at the states’ desire to keep their citizens healthy and safe and also at the various laws and regulations they have in place around data breaches and losses. The size and scale of the problem together with some ideas for effective risk mitigation are in this whitepaper.
Let’s now start delving a little deeper into the situation states are faced with. It’s pretty obvious that citizen data that enables an individual to be identified (PII) needs to be protected. We immediately think of the production data: data that is used in integrated eligibility systems; in health insurance exchanges; in data warehouses and so on. In some ways the production data is the least of our problems; our research shows that the average state has around 10 to 12 full copies of data for non-production (development, test, user acceptance and so on) purposes. This data tends to be much more vulnerable because it is widespread and used by a wide variety of people – often subcontractors or outsourcers, and often the content of the data is not well understood.
Obviously production systems need access to real production data (I’ll cover how best to protect that in the next issue), on the other hand non-production systems of every sort do not. Non-production systems most often need realistic, but not real data and realistic, but not real data volumes (except maybe for the performance/stress/throughput testing system). What need to be done? Well to start with, a three point risk remediation plan would be a good place to start.
1. Understand the non-production data using sophisticated data and schema profiling combined with NLP (Natural Language Processing) techniques help to identify previously unrealized PII that needs protecting.
2. Permanently mask the PII so that it is no longer the real data but is realistic enough for non-production uses and make sure that the same masking is applied to the attribute values wherever they appear in multiple tables/files.
3. Subset the data to reduce data volumes, this limits the size of the risk and also has positive effects on performance, run-times, backups etc.
Gartner has just published their 2013 magic quadrant for data masking this covers both what they call static (i.e. permanent or persistent masking) and dynamic (more on this in the next issue) masking. As usual the MQ gives a good overview of the issues behind the technology as well as a review of the position, strengths and weaknesses of the leading vendors.
It is (or at least should be) an imperative that from the top down state governments realize the importance and vulnerability of their citizens data and put in place a non-partisan plan to prevent any future breaches. As the reader might imagine, for any such plan to success needs a combination of cultural and organizational change (getting people to care) and putting the right technology – together these will greatly reduce the risk. In the next and final issue on this topic we will look at the vulnerabilities of production data, and what can be done to dramatically increase its privacy and security.