Tag Archives: dynamic data masking
- PII – Personally Identifiable Information – any data that could potentially identify a specific individual. Any information that can be used to distinguish one person from another, or to de-anonymize anonymous data, can be considered PII
- GSA’s Rules of Behavior for Handling Personally Identifiable Information – This directive provides GSA’s policy on how to properly handle PII and the consequences and corrective actions that will be taken if a breach occurs
- PHI – Protected Health Information – any information about health status, provision of health care, or payment for health care that can be linked to a specific individual
- HIPAA Privacy Rule – The HIPAA Privacy Rule establishes national standards to protect individuals’ medical records and other personal health information and applies to health plans, health care clearinghouses, and those health care providers that conduct certain health care transactions electronically. The Rule requires appropriate safeguards to protect the privacy of personal health information, and sets limits and conditions on the uses and disclosures that may be made of such information without patient authorization. The Rule also gives patients rights over their health information, including rights to examine and obtain a copy of their health records, and to request corrections.
- Encryption – a method of protecting data by scrambling it into an unreadable form. It is a systematic encoding process which is only reversible with the right key.
- Tokenization – a method of replacing sensitive data with non-sensitive placeholder tokens. These tokens are swapped with data stored in relational databases and files.
- Data masking – a process that scrambles data, either an entire database or a subset. Unlike encryption, masking is not reversible; unlike tokenization, masked data is useful for limited purposes. There are several types of data masking:
- Static data masking (SDM) masks data in advance of its use; non-production databases are masked ahead of time, not in real time
- Dynamic data masking (DDM) masks production data in real time
- Data Redaction – masks unstructured content (PDF, Word, Excel)
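As a rough illustration of static masking (the value below is invented), a masking job might overwrite sensitive columns with format-preserving random values before a database copy goes to a test environment, so the masked data keeps the original's shape and data type:

```python
import random
import string

def mask_preserving_format(value: str) -> str:
    """Replace each digit/letter with a random one of the same kind,
    keeping separators (dashes, spaces) so the format survives."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(random.choice(string.digits))
        elif ch.isalpha():
            out.append(random.choice(string.ascii_uppercase))
        else:
            out.append(ch)  # keep separators like '-' intact
    return "".join(out)

masked = mask_preserving_format("987-65-4329")
print(masked)             # e.g. '402-91-7758': same shape, new digits
print(len(masked) == 11)  # True: length and dash positions are preserved
```

Because the replacement digits are drawn at random rather than derived from the input, there is no key and no way to reverse the operation, which is exactly the property that distinguishes masking from encryption.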
Each of the three methods for protecting data (encryption, tokenization, and data masking) has different benefits and solves different security issues, which we'll address below. For a visual comparison of the three methods, please see the table below:
For protecting PHI data – encryption is superior to tokenization. You encrypt different portions of personal healthcare data under different encryption keys. Only those with the requisite keys can see the data. This form of encryption requires advanced application support to manage the different data sets to be viewed or updated by different audiences. The key management service must be very scalable to handle even a modest community of users. Record management is particularly complicated. Encryption works better than tokenization for PHI – but it does not scale well.
Properly deployed, encryption is a perfectly suitable tool for protecting PII. It can be set up to protect archived data or data residing on file systems without modification to business processes.
- To protect the data, you must install encryption and key management services; this protects the data only from access that circumvents applications
- You can add application layer encryption to protect data in use
- This requires changing applications and databases to support the additional protection
- You will pay the cost of modification and the performance of the application will be impacted
For tokenization of PHI – there are many pieces of data which must be bundled up in different ways for many different audiences. Using the tokenized data requires it to be de-tokenized (which usually includes a decryption process). This introduces an overhead to the process. A person’s medical history is a combination of medical attributes, doctor visits, outsourced visits. It is an entangled set of personal, financial, and medical data. Different groups need access to different subsets. Each audience needs a different slice of the data – but must not see the rest of it. You need to issue a different token for each and every audience. You will need a very sophisticated token management and tracking system to divide up the data, issuing and tracking different tokens for each audience.
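One way to sketch per-audience tokenization (the keys, audiences, and field values below are illustrative, not any vendor's API) is to derive a different token for the same value under each audience's key, so tokens held by one group cannot be correlated with another group's:

```python
import hmac
import hashlib

# Hypothetical per-audience secret keys held by the token service
AUDIENCE_KEYS = {
    "billing":  b"billing-secret-key",
    "research": b"research-secret-key",
}

def tokenize(value: str, audience: str) -> str:
    """Derive a deterministic, audience-specific token for a value."""
    key = AUDIENCE_KEYS[audience]
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

ssn = "987-65-4329"
t_billing = tokenize(ssn, "billing")
t_research = tokenize(ssn, "research")
# Same SSN, different token per audience; correlating them requires the keys
print(t_billing != t_research)  # True
```

Even this toy version hints at the management burden the text describes: every audience adds a key to issue, rotate, and track, and every new data slice multiplies the token bookkeeping.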
Masking can scramble individual data columns in different ways so that the masked data looks like the original (retaining its format and data type) but is no longer sensitive. Masking is effective for maintaining aggregate values across an entire database, enabling preservation of sums and averages within a data set while changing all the individual data elements. Masking plus encryption provides a powerful combination for distribution and sharing of medical information.
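A simple way to see how masking can preserve aggregates while scrambling individuals is to shuffle a column: each value moves to a different row, breaking the link to any one person, yet the sum and average of the data set are untouched (the salary column below is invented for illustration):

```python
import random

salaries = [52000, 61000, 48000, 75000, 58000]  # hypothetical column

masked = salaries[:]
random.shuffle(masked)  # rows no longer match individuals; aggregates survive

print(sum(masked) == sum(salaries))                                  # True
print(sum(masked) / len(masked) == sum(salaries) / len(salaries))    # True
```

Real masking tools use more sophisticated substitution than a shuffle, but the principle is the same: analytic properties of the whole column can be preserved even though no individual row is trustworthy.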
Traditionally, data masking has been viewed as a technique for solving a test data problem. The December 2014 Gartner Magic Quadrant Report on Data Masking Technology extends the scope of data masking to more broadly include data de-identification in production, non-production, and analytic use cases. The challenge is to do this while retaining business value in the information for consumption and use.
Masked data should be realistic and quasi-real. It should satisfy the same business rules as real data. It is very common to use masked data in test and development environments as the data looks like “real” data, but doesn’t contain any sensitive information.
In recent conversations regarding solutions to implement for data privacy, our Dynamic Data Masking team put together the following table to highlight the differences between encryption / tokenization and Dynamic Data Masking (DDM). Best practices dictate that both should be implemented in an enterprise for the most comprehensive and complete data security strategy. For the purpose of this blog, here are a few definitions:
Dynamic Data Masking (DDM) protects sensitive data when it is retrieved, based on policy, without requiring the data to be altered when it is stored persistently. Authorized users will see true data; unauthorized users will see masked values in the application. No coding is required in the source application.
Encryption / tokenization protects sensitive data by altering its values when stored persistently, while being able to decrypt and present the original values when requested by authorized users. The user is validated by a separate service, which then provides a decryption key. Unauthorized users will only see the encrypted values. In many cases, applications need to be altered, requiring development work.
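The contrast can be sketched as a masking rule applied at retrieval time. The role names and masking format below are assumptions for illustration, not the product's actual policy language:

```python
AUTHORIZED_ROLES = {"claims_adjuster", "account_manager"}  # hypothetical roles

def mask_ssn(ssn: str) -> str:
    """Show only the last four digits, a typical masked display format."""
    return "***-**-" + ssn[-4:]

def retrieve_ssn(stored_ssn: str, role: str) -> str:
    """DDM-style rule: the data at rest is unchanged; what the caller
    sees depends on the requesting user's role."""
    if role in AUTHORIZED_ROLES:
        return stored_ssn           # authorized users see the true value
    return mask_ssn(stored_ssn)     # everyone else sees a masked value

print(retrieve_ssn("987-65-4329", "claims_adjuster"))  # 987-65-4329
print(retrieve_ssn("987-65-4329", "offshore_dev"))     # ***-**-4329
```

Note that the stored value is never modified; the decision is made per request, which is why no application source-code change is needed.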
| Scenario | Encryption / Tokenization | Dynamic Data Masking (DDM) |
| --- | --- | --- |
| Business users access PII | Business users work with actual SSNs and personal values in the clear (not with tokenized values). Because the data is tokenized in the database, it must be de-tokenized every time a user accesses it, which is done by changing the application source code (imposing cost and risk) and causes a performance penalty. For example, if a user needs to retrieve information on a client with SSN '987-65-4329', the application must de-tokenize the entire tokenized SSN column to identify the correct client, a costly operation. This is why implementation scope is limited. | DDM does not change the data in the database; it only masks it when accessed by unauthorized users, so authorized users experience no performance hit and no application source-code changes are required. For example, if an authorized user retrieves information on a client with SSN '987-65-4329', the request passes through DDM untouched; because the stored SSN is unchanged, there is no performance penalty. If an unauthorized user retrieves the same SSN, DDM masks the SQL request, causing the sensitive results (e.g., name, address, credit card, and age) to be masked, hidden, or completely blocked. |
| Privileged infrastructure DBAs have access to the database server files | PII stored in the database files is tokenized, ensuring that the few administrators with uncontrolled access to the database servers cannot see it. | PII stored in the database files remains in the clear. The few administrators with uncontrolled access to the database servers can potentially access it. |
| Production support, application developers, DBAs, consultants, outsourced and offshore teams | These users have application super-user privileges, are seen by the tokenization solution as authorized, and as such access PII in the clear. | These users are identified by DDM as unauthorized; what they see is masked, hidden, or blocked, protecting the PII. |
| Data warehouse protection | Implementing tokenization on data warehouses requires tedious database changes and causes a performance penalty: (1) loading or reporting on millions of PII records requires tokenizing/de-tokenizing each record; (2) running a report with a condition on a tokenized value (e.g., SSN LIKE '%333') forces de-tokenization of the entire column. Massive database configuration changes are required to use the tokenization API, creating and maintaining hundreds of views. | No performance penalty. No need to change reports or databases, or to create views. |
Combining both DDM and encryption/tokenization presents an opportunity to deliver complete data privacy without the need to alter the application or write any code.
Informatica works with its encryption and tokenization partners to deliver comprehensive data privacy protection in packaged applications, data warehouses and Big Data platforms such as Hadoop.
Informatica Recognized By Gartner as a Leader in Data Masking and by Infosecurity for Best Security Software
Informatica was named as a leader in the 2012 Gartner Magic Quadrant for Data Masking. A couple of weeks ago, Infosecurity named Informatica as a finalist for Best Security Software for 2013.
Both the Gartner Magic Quadrant for Data Masking and Infosecurity Products Guide recognized Informatica for continued innovation:
- Gartner states, “The data masking portfolio has been broadening. In addition to SDM technology… the market is beginning to offer dynamic data masking (DDM)… ”
Personally Identifiable Information is under attack like never before. Two prominent institutions were recently in the news after being attacked. What happened:
- A data breach at a major U.S. insurance company exposed over a million of its policyholders to identity fraud. The stolen data included Personally Identifiable Information such as names, Social Security numbers, driver’s license numbers, and birth dates. In addition to Nationwide paying for identity fraud protection for policyholders, this breach is raising fears that class action lawsuits will follow.
Given the reputation risk and the cost of security breaches, organizations know they should be implementing data privacy in all their environments—whether it’s in production or test and development.
But the question I often get is: I know we need better data security, but how do I prioritize these projects above other projects?
As our customers have shared with us the time and cost savings they achieved using our software, we have created a business value assessment that uses those benchmarks to calculate the benefits organizations would achieve by implementing a solution like Informatica Data Privacy. This business value assessment is based on the best practices for managing the data privacy lifecycle and includes the following phases seen below.
For each of these phases, we have collected data on how our customers have benefited and used those figures as the basis for calculating benefits for any organization using the Informatica solution. We’ve also used industry benchmarks to calculate risk mitigation and hardware cost savings. The following benefits, which map to the data privacy lifecycle, are those our customers have realized:
Accelerate Sensitive Data Discovery – Rapidly identify sensitive data across all legacy and packaged applications
Increase Development Productivity – Develop global masking rules more efficiently
Increase Testing Productivity – Reduce the time it takes to capture optimal test case data
Increase Quality – Use realistic data in QA and development to reduce later rework and fixes
Risk Mitigation – Avoid breaches, reducing victim notification costs, fines
Hardware Reduction – Subset and create smaller copies of production for test purposes, reducing storage costs
Increase Compliance Reporting Productivity – Prove compliance through automated reports on masked data
Outsourcing Savings – Because data is masked, companies can then outsource application development or support.
As a result of using this type of assessment early in their project cycle, our customers have successfully made the case to prioritize these data privacy projects.
Let us know if you want to talk about the Business Value Assessment for Data Privacy— so you too can say, “I know we need to mitigate risk—and here’s how we can minimize the costs of doing so.”
In a May 2012 survey by the Ponemon Institute, 66 percent of respondents said they are not confident their organization would be able to detect the loss or theft of sensitive personal information contained in systems operated by third parties, including cloud providers. In addition, the majority are not confident that their organization would be able to detect the loss or theft of sensitive personal information in their company’s production environment.
Which aspect of data security for your cloud solution is most important?
1. Is it to protect the data in copies of production/cloud applications used for test or training purposes? For example, do you need to secure data in your Salesforce.com Sandbox?
2. Is it to protect the data so that a user will see data based on her/his role, privileges, location and data privacy rules?
3. Is it to protect the data before it gets to the cloud?
Compliance continues to drive people to action, and compliance with contractual agreements, especially for cloud infrastructure, continues to drive investment. In addition, many organizations are supporting Salesforce.com as well as packaged solutions such as Oracle eBusiness, Peoplesoft, SAP, and Siebel.
Of the available data protection solutions, tokenization has been used and is well known for supporting PCI data and preserving the format and width of a table column. But because many tokenization solutions today require creating database views or changing application source code, it has been difficult for organizations to support packaged applications that don’t allow these changes. In addition, databases and applications take a measurable performance hit to process tokens.
What might work better is to dynamically tokenize data before it gets to the cloud. So there would be a transparent layer between the cloud and on-premise data integration that would replace the sensitive data with tokens. In this way, additional code to the application would not be required.
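Such a layer could be sketched as an on-premise token vault that swaps sensitive fields for tokens before a record leaves for the cloud and reverses the swap on the way back; the vault class and field names here are hypothetical:

```python
import secrets

class TokenVault:
    """Minimal on-premise token vault: the token <-> value mapping
    never leaves the local side."""
    def __init__(self):
        self._by_token = {}

    def tokenize(self, value: str) -> str:
        token = "tok_" + secrets.token_hex(8)  # random, format not preserved
        self._by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        return self._by_token[token]

vault = TokenVault()
record = {"name": "Pat Smith", "ssn": "987-65-4329"}

# Before sending to the cloud: replace the sensitive field with a token
outbound = dict(record, ssn=vault.tokenize(record["ssn"]))
print(outbound["ssn"] != record["ssn"])  # True: cloud never sees the real SSN

# On the way back: restore the original value locally
restored = dict(outbound, ssn=vault.detokenize(outbound["ssn"]))
print(restored == record)  # True
```

Because the substitution happens in the integration layer, the cloud application stores and displays tokens without any code change; only the on-premise side can map them back.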
In the Ponemon survey, most said the best control is to dynamically mask sensitive information based on the user’s privilege level. After dynamically masking sensitive data, people said encrypting all sensitive information contained in the record is the best option.
The strange thing is that people recognize there is a problem but are not spending accordingly. In the same survey from Ponemon, 69% of organizations find it difficult to restrict user access to sensitive information in IT and business environments. However, only 33% say they have adequate budgets to invest in the necessary solutions to reduce the insider threat.
Is this an opportunity for you?
Hear Larry Ponemon discuss the survey results in more detail during a CSOonline.com/Computerworld webinar, Data Privacy Challenges and Solutions: Research Findings with Ponemon Institute, on Wednesday, June 13.
Recently, Oracle announced that its latest April critical patch update does not address the TNS Poison vulnerability uncovered by a researcher 4 years ago. In addition to this vulnerability from an attacker, organizations face data breaches from internal negligence and insiders. In a May 2012 survey by the Ponemon Institute, 50% say sensitive data contained in databases and applications has been compromised or stolen by malicious insiders such as privileged users. On top of that 68% find it difficult to restrict user access to sensitive information in IT and business environments.
While databases offer basic security features that can be programmed and configured to protect data, they may not be enough and may not scale with your growing organization. The problem stems from the fact that application development and DBA teams need a solid understanding of vendor-specific database offerings to ensure that each security feature has been properly set up and deployed. If your organization has a number of different databases (Oracle, DB2, Microsoft SQL Server) and that number is growing, maintaining all the database-specific solutions can be costly. Many Informatica customers have faced this problem and looked to Informatica to provide a complete, end-to-end solution that addresses database security on an enterprise-wide level.
Come talk to us at Informatica World and hear from our customers about how they’ve used Informatica to minimize the risk of breaches across a number of use cases including:
– Test data management
– Production support in off-shore projects
– Dynamically protecting PII or PHI data for research portals
– Dynamically protecting data in cross-border applications
At Informatica, you can meet us in our sessions on Thursday, May 17, at the Aria in Las Vegas:
10:10 – 11:10 – Ensuring Data Privacy for Warehouses and Applications with Informatica Data Masking in Room Juniper 3
11:20 – 12:20 – Protecting Sensitive Data Using Informatica’s Test Data Management Solution in Room Starvine 12
Also come to the Informatica Data Privacy booth and lab for in depth demonstrations and presentations of our data privacy solutions and customer deployments.
Data breaches in healthcare have increased 32 percent in the past year and cost the industry an estimated $6.5 billion annually, according to the Ponemon Institute. These breaches were largely attributable to employee handling of data and the increasing use of mobile devices. Forty-one percent of healthcare executives surveyed attributed data breaches related to protected health information (PHI) to employee mistakes. Half of the respondents said their organization does nothing to protect the information contained on mobile devices. “Healthcare data breaches are an epidemic,” said Dr. Larry Ponemon, chairman and founder, Ponemon Institute, in an announcement of the study results.
Why are healthcare data breaches becoming more common?
PHI data lives in all production and test systems, as well as in the numerous copies of production systems created for test, training, and application development purposes. Beyond these systems, PHI lives on servers inside and outside the organization. As more mobile devices are used to access critical patient data, and doctors use their mobile devices to address medical issues from all over the country (if not the world), more sensitive patient data is exposed. In addition to structured PHI data such as Social Security numbers, much of the sensitive data healthcare organizations hold is contained in textual notes, so that textual data also needs to be protected. And patient data needs protection not only within the hospital or healthcare organization: as patient data is used for clinical trials and research, it is important to protect the data that leaves the organization.
To address these concerns, Informatica has seen organizations move towards an end-to-end, enterprise wide data privacy solution that enables them to:
– Consistently define sensitive data and set data privacy policies
– Identify where sensitive data lives throughout the organization
– Create subsets of production data for testing purposes, greatly reducing costs of managing test data (reducing hardware and software)
– Mask data according to all required PHI rules
– Report / provide audit trail that data has been masked and data is secure
Maintaining many, individual privacy solutions can be both costly and risky. An enterprise wide solution centralizes data privacy management, streamlining development and ongoing maintenance.
For more information on healthcare privacy challenges and how to address them, please join us in our upcoming webinar.
It’s hard to miss data privacy in the headlines these days. Banks and insurance companies have not only had their customer information compromised, but they also need to keep up with changing privacy regulations (PCI DSS, GLB, EU Data Protection Directive, US privacy laws) or be fined. The impact is staggering, and costly. For example, last year Citigroup had information compromised for more than 200,000 bank cardholders. HSBC faced $5M in fines for inadequate data security.
But personal information is not the only type of data that needs to be protected. We’ve spoken to our customers about the need to protect sensitive information that includes financial information about a client, revenues, and purchasing and pricing information. In addition, I’ve spoken to organizations that are looking to protect sensitive information across business units (so that one business unit has restricted access to another business unit’s data).
It seems like every day a new data breach splashes across the news. As consumers, patients, customers and social networkers many of us have a plethora of information stored in various databases well outside our control. Data security officers, DBAs and other security specialists continue to do their best to educate, protect and anticipate both internal and external threats. But … the breaches continue and so do their associated costs. There are many technologies from encryption to tokenization to database activity monitoring (DAM) to data loss prevention (DLP).
Informatica just released a new option to the mix: dynamic data masking. The technology came into the company through the acquisition of ActiveBase. Since then I’ve had a number of people ask me if Informatica Dynamic Data Masking will complement or replace an organization’s existing data security technologies.