Which Method of Controls Should You Use to Protect Sensitive Data in Databases and Enterprise Applications? Part II
- Do you need to protect data at rest (in storage), during transmission, and/or when accessed?
- Do some privileged users still need the ability to view the original sensitive data or does sensitive data need to be obfuscated at all levels?
- What is the granularity of controls that you need?
  - Datafile level
  - Table level
  - Row level
  - Field / column level
  - Cell level
- Do you need to be able to control viewing vs. modification of sensitive data?
- Do you need to maintain the original characteristics / format of the data (e.g. for testing, demo, development purposes)?
- Is response time latency / performance of high importance for the application? This can be the case for mission-critical production applications that need to maintain response times on the order of seconds or less.
In order to help you determine which method of control is appropriate for your requirements, the following table provides a comparison of the different methods and their characteristics.
A combination of protection methods may be appropriate based on your requirements. For example, to protect data in non-production environments, you may want to use persistent data masking to ensure that no one has access to the original production data, since no one there needs it. This is especially true if your development and testing are outsourced to third parties. In addition, persistent data masking allows you to maintain the original characteristics of the data to ensure test data quality.
In production environments, you may want to use a combination of encryption and dynamic data masking. This is the case if you would like to ensure that all data at rest is protected against unauthorized users, and you also need to mask sensitive fields for certain sets of authorized or privileged users while the rest of your users view the data in the clear.
The best method or combination of methods will depend on each scenario and the set of requirements for your environment and organization. As with any technology or solution, there is no one-size-fits-all answer.
Which Method of Controls Should You Use to Protect Sensitive Data in Databases and Enterprise Applications? Part I
- Which types of data should be protected?
- Which data should be classified as “sensitive?”
- Where is this sensitive data located?
- Which groups of users should have access to this data?
Because these questions come up frequently, it seems ideal to share a few guidelines on this topic.
When protecting the confidentiality and integrity of data, the first level of defense is authentication and access control. However, data with higher levels of sensitivity or confidentiality may require additional levels of protection beyond regular authentication and authorization methods.
There are a number of control methods for securing sensitive data available in the market today, including:
- Encryption
- Persistent (Static) Data Masking
- Dynamic Data Masking
- Tokenization
- Retention management and purging
Encryption is a cryptographic method of encoding data. There are generally two methods of encryption: symmetric (using a single secret key) and asymmetric (using public and private keys). Although there are methods of deciphering encrypted information without possessing the key, a good encryption algorithm makes it very difficult to decode the encrypted data without knowledge of the key. Key management is usually a major concern with this method of control. Encryption is ideal for mass protection of data (e.g. an entire data file, table, partition, etc.) against unauthorized users.
Persistent or static data masking obfuscates data at rest in storage. There is usually no way to retrieve the original data – the data is permanently masked. There are multiple techniques for masking data, including: shuffling, substitution, aging, encryption, domain-specific masking (e.g. email address, IP address, credit card, etc.), dictionary lookup, randomization, etc. Depending on the technique, there may be ways to perform reverse masking – this should be used sparingly. Persistent masking is ideal for cases where no users should see the original sensitive data (e.g. for test / development environments) and field-level data protection is required.
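As a rough illustration of the substitution technique mentioned above, here is a minimal Python sketch that masks selected fields while keeping their format (digits stay digits, letters keep their case, punctuation is untouched). The function names, field names, and sample rows are hypothetical, and a real masking tool would do considerably more (referential consistency across tables, domain-specific rules, and so on).

```python
import random
import string


def mask_value(value: str, rng: random.Random) -> str:
    """Swap each digit for a random digit and each letter for a random letter
    of the same case; leave punctuation and spaces so the format survives."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)


def mask_rows(rows, sensitive_fields, seed=None):
    """Return masked copies of the rows; the input rows are not modified."""
    rng = random.Random(seed)
    masked = []
    for row in rows:
        copy = dict(row)
        for field in sensitive_fields:
            copy[field] = mask_value(copy[field], rng)
        masked.append(copy)
    return masked


# Hypothetical production extract destined for a test environment.
production_rows = [
    {"id": 1, "name": "Alice Smith", "card": "4111-1111-1111-1111"},
    {"id": 2, "name": "Bob Jones", "card": "5500-0000-0000-0004"},
]
test_rows = mask_rows(production_rows, ["name", "card"], seed=42)
```

Because the masked copy is what gets loaded into the test system, the original values are simply not present there, which is the point of the persistent approach.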
Dynamic data masking de-identifies data when it is accessed. The original data is still stored in the database. Dynamic data masking (DDM) acts as a proxy between the application and database and rewrites the user / application request against the database depending on whether the user has the privilege to view the data or not. If the requested data is not sensitive or the user is a privileged user who has the permission to access the sensitive data, then the DDM proxy passes the request to the database without modification, and the result set is returned to the user in the clear. If the data is sensitive and the user does not have the privilege to view the data, then the DDM proxy rewrites the request to include a masking function and passes the request to the database to execute. The result is returned to the user with the sensitive data masked. Dynamic data masking is ideal for protecting sensitive fields in production systems where application changes are difficult or disruptive to implement and performance / response time is of high importance.
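The request-rewriting flow described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "proxy" receives a column list instead of parsing arbitrary SQL, SQLite stands in for the production database, and the masking rule, user names, and table are all hypothetical.

```python
import sqlite3

# Hypothetical policy: sensitive column -> SQL masking expression (SQLite syntax).
MASKING_RULES = {
    "ssn": "'***-**-' || substr(ssn, -4)",
}
# Hypothetical set of users allowed to see sensitive data in the clear.
PRIVILEGED_USERS = {"auditor"}


def rewrite_query(columns, table, user):
    """Build the SELECT the proxy would forward to the database.

    Identifiers are interpolated directly, so this sketch assumes they come
    from trusted application code, not from end-user input.
    """
    if user in PRIVILEGED_USERS:
        select_list = list(columns)  # pass through without modification
    else:
        select_list = [
            f"{MASKING_RULES[c]} AS {c}" if c in MASKING_RULES else c
            for c in columns
        ]
    return f"SELECT {', '.join(select_list)} FROM {table}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, ssn TEXT)")
conn.execute("INSERT INTO customers VALUES ('Alice', '123-45-6789')")


def run_as(user):
    """Execute the (possibly rewritten) query on behalf of the given user."""
    sql = rewrite_query(["name", "ssn"], "customers", user)
    return conn.execute(sql).fetchall()
```

With this setup, `run_as("auditor")` returns the row in the clear, while `run_as("clerk")` returns `[('Alice', '***-**-6789')]`. The stored data is never changed; only the query that reaches the database differs.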
Tokenization substitutes a sensitive data element with a non-sensitive data element or token. First-generation tokenization systems require a token server and a database to store the original sensitive data. The mapping from the clear text to the token makes it very difficult to reverse the token back to the original data without the token system. However, because the original sensitive data lives in the token server's mapping database, the token server and that database become a potential security vulnerability, a scalability bottleneck, and a single point of failure. Next-generation tokenization systems have addressed these weaknesses. Tokenization does still require changes to the application layer to tokenize and detokenize when the sensitive data is accessed. Tokenization can be used in production systems to protect sensitive data at rest in the database store, when changes to the application layer can be made relatively easily to perform the tokenization / detokenization operations.
Retention management and purging is more of a data management method to ensure that data is retained only as long as necessary. The best method of reducing data privacy risk is to eliminate the sensitive data. Therefore, appropriate retention, archiving, and purging policies should be applied to reduce the privacy and legal risks of holding on to sensitive data for too long. Retention management and purging is a data management best practice that should always be put to use.
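To make the first-generation design concrete, here is a minimal Python sketch of a vault-style tokenizer. The `TokenVault` class and the `tok_` prefix are made up for illustration; an in-memory dictionary stands in for the token server and its mapping database, which in a real deployment would be a hardened, separately secured service.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a token server and its mapping database."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a random token for the value, reusing any existing mapping."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._token_to_value:  # regenerate on the rare collision
            token = "tok_" + secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]


vault = TokenVault()
card_token = vault.tokenize("4111-1111-1111-1111")  # what the application stores
original = vault.detokenize(card_token)             # only possible via the vault
```

The token is random, so nothing about the card number can be derived from it. The sketch also shows why the vault is the weak spot: everything needed to reverse the tokens sits in one place.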
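A purge job driven by a retention policy might look like the following Python sketch. The seven-year period, the `orders` table, and the `created_on` column are assumptions chosen for the example, and SQLite is used only so the snippet is self-contained.

```python
import sqlite3
from datetime import date, timedelta

RETENTION_DAYS = 365 * 7  # hypothetical seven-year retention policy


def purge_expired(conn, today):
    """Delete rows older than the retention period; return how many were removed."""
    cutoff = (today - timedelta(days=RETENTION_DAYS)).isoformat()
    # ISO-8601 date strings sort chronologically, so a text comparison works.
    cur = conn.execute("DELETE FROM orders WHERE created_on < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, created_on TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2005-01-15"), (2, "2013-06-30")],
)
deleted = purge_expired(conn, today=date(2014, 5, 1))
```

In practice the purge would usually be preceded by an archive step and checked against any legal holds before rows are removed.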
- A loss of customer trust
- Revenue shortfalls
- A plummeting stock price
- C-level executives losing their jobs
As a result, data security and privacy have become key topics of discussion, not just in IT meetings, but in the media and the boardroom.
Preventing access to sensitive data has become more complex than ever before. There are new potential entry points that IT never previously considered. These new entry points go beyond typical BYOD user devices like smartphones and tablets. Today’s entry points can be much smaller: things like HVAC controllers, office polycoms and temperature control systems.
So what can organizations do to combat this increasing complexity? Traditional data security practices focus on securing both the perimeter and the endpoints. However, these practices are clearly no longer working and no longer manageable. Not only are the number and types of devices expanding, but the perimeter itself is no longer present. As companies increasingly outsource, off-shore and move operations to the cloud, it is no longer possible to fence the perimeter and keep intruders out. Because third parties often require some form of access, even trusted user credentials may fall into the hands of malicious intruders.
Data security requires a new approach. It must use policies to follow the data and to protect it, regardless of where it is located and where it moves. Informatica is responding to this need. We are leveraging our market leadership and domain expertise in data management and security. We are defining a new data security offering and category. This week, we unveiled our entry into the Data Security market at our Informatica World conference. Our new security offering, Secure@Source™, will allow enterprises to discover, detect and protect sensitive data.
The first step towards protecting sensitive data is to locate and identify it. So Secure@Source™ first allows you to discover where all the sensitive data is located in the enterprise and classify it. As part of the discovery, Secure@Source™ also analyzes where sensitive data is being proliferated, who has access to the data, who is actually accessing it, and whether the data is protected or unprotected when accessed. Secure@Source™ leverages Informatica’s PowerCenter repository and lineage technology to perform a quick first-pass discovery, followed by more in-depth analysis and profiling over time. The solution allows you to determine the privacy risk index of your enterprise and slice and dice the analysis by region, department, organization hierarchy, and data classification.
The longer-term vision of Secure@Source™ will allow you to detect suspicious usage patterns and orchestrate the appropriate data protection method, such as: alerting, blocking, archiving and purging, dynamically masking, persistently masking, encrypting, and/or tokenizing the data. The data protection method will depend on whether the data store is a production or non-production system, and whether you would like to de-identify sensitive data across all users or only for some users. All can be deployed based on policies. Secure@Source™ is intended to be an open framework for aggregating data security analytics and will integrate with key partners to provide comprehensive visibility into, and assessment of, an enterprise’s data privacy risk.
Secure@Source™ is targeted for beta at the end of 2014 and general availability in early 2015. Informatica is recruiting a select group of charter customers to drive and provide feedback for the first release. Customers who are interested in being a charter customer should register and send email to SecureCustomers@informatica.com.
What is In-Database Archiving in Oracle 12c and Why You Still Need a Database Archiving Solution to Complement It (Part 2)
In my last blog on this topic, I discussed several areas where a database archiving solution can complement or help you to better leverage the Oracle In-Database Archiving feature. For an introduction of what the new In-Database Archiving feature in Oracle 12c is, refer to Part 1 of my blog on this topic.
Here, I will discuss additional areas where a database archiving solution can complement the new Oracle In-Database Archiving feature:
- Graphical UI for ease of administration – In-Database Archiving is currently a technical feature of the Oracle database, and is not easily visible or manageable outside of the DBA persona. This is where a database archiving solution provides a more comprehensive set of graphical user interfaces (GUIs) that make this feature easier to monitor and manage.
- Enabling application of In-Database Archiving for packaged applications and complex data models – A database archiving solution adds the concept of business entities (transactional records composed of related tables), which maintains data and referential integrity as you archive, move, purge, and retain data. It also adds business rules that determine when data has become inactive and can therefore be safely archived. Together, these allow DBAs to apply this new Oracle feature to more complex data models. Also, the availability of application accelerators (prebuilt metadata of business entities and business rules for packaged applications) enables the application of In-Database Archiving to packaged applications like Oracle E-Business Suite, PeopleSoft, Siebel, and JD Edwards.
What is In-Database Archiving in Oracle 12c and Why You Still Need a Database Archiving Solution to Complement It (Part 1)
What is the new In-Database Archiving in the latest Oracle 12c release?
On June 25, 2013, Oracle introduced a new feature called In-Database Archiving with its new release of Oracle 12c. “In-Database Archiving enables you to archive rows within a table by marking them as inactive. These inactive rows are in the database and can be optimized using compression, but are not visible to an application. The data in these rows is available for compliance purposes if needed by setting a session parameter. With In-Database Archiving you can store more data for a longer period of time within a single database, without compromising application performance. Archived data can be compressed to help improve backup performance, and updates to archived data can be deferred during application upgrades to improve the performance of upgrades.”
This is an Oracle specific feature and does not apply to other databases.
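The row-marking idea in the quoted description can be emulated in a few lines of Python to show the behavior. This is not Oracle syntax and does not use Oracle at all: SQLite, an ordinary `archive_state` column, and a `show_archived` flag (standing in for the session parameter) are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# An ordinary column plays the role of the "inactive" marker; '0' means active.
conn.execute(
    "CREATE TABLE orders (id INTEGER, status TEXT, archive_state TEXT DEFAULT '0')"
)
conn.executemany(
    "INSERT INTO orders (id, status) VALUES (?, ?)",
    [(1, "closed"), (2, "open"), (3, "closed")],
)

# "Archive" inactive rows in place by flagging them; no data leaves the table.
conn.execute("UPDATE orders SET archive_state = '1' WHERE status = 'closed'")


def fetch_orders(conn, show_archived=False):
    """Return active rows by default; include archived rows only on request."""
    sql = "SELECT id, status FROM orders"
    if not show_archived:
        sql += " WHERE archive_state = '0'"
    return conn.execute(sql + " ORDER BY id").fetchall()
```

Here `fetch_orders(conn)` returns only the open order, while `fetch_orders(conn, show_archived=True)` returns all three rows, mirroring how archived rows stay in the database but are hidden from the application unless visibility is switched on.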
Data warehouses tend to grow very quickly because they integrate data from multiple sources and maintain years of historical data for analytics. A number of our customers have data warehouses in the hundreds of terabytes to petabytes range. Managing such a large amount of data becomes a challenge. How do you curb runaway costs in such an environment? Completing maintenance tasks within the prescribed window and ensuring acceptable performance are also big challenges.
We have provided best practices to archive aged data from data warehouses. Archiving data will keep the production data size at an almost constant level, reducing infrastructure and maintenance costs while keeping performance up. At the same time, you can still access the archived data directly from any reporting tool if you really need to. Yet many are loath to move data out of their production system. This year, at Informatica World, we’re going to discuss another method of managing data growth without moving data out of the production data warehouse. I’m not going to tell you what this new method is, yet. You’ll have to come and learn more about it at my breakout session at Informatica World: What’s New from Informatica to Improve Data Warehouse Performance and Lower Costs.
I look forward to seeing all of you at Aria, Las Vegas next month. Also, I am especially excited to see our ILM customers at our second Product Advisory Council again this year.
I’ve been approached by a number of customers who are looking to archive data from their Salesforce application. There are two primary drivers I have heard cited:
- The need to manage the retention of Salesforce data and easily find and access it for legal eDiscovery
- Storage cost reduction for data that’s no longer active
Just like your on-premise database applications like E-Business Suite, PeopleSoft, Siebel and custom applications, SaaS applications such as Salesforce, Oracle CRM On Demand, Microsoft Dynamics, NetSuite, Eloqua and others will experience large data growth causing performance issues and increasing costs.
As data grows in your SaaS applications, the performance of accessing transactions and reporting will degrade. Your SaaS vendors will also require more time, effort, and cost to maintain and manage this data. Backups, upgrades and replication of these environments will take longer and application availability will be impacted due to longer maintenance windows. Your SaaS application vendors will require more storage to house the additional data and this cost will be passed on to you.
SAP’s data warehouse solution (SAP BW) provides enterprises the ability to easily build a warehouse over their existing operational systems with pre-defined extraction and reporting objects and methods. Data that is loaded into SAP BW is stored in a layered architecture which encourages reusability of data throughout the system in a standardized way. SAP’s implementation also enables easy audits of data delivery mechanisms that are used to produce various reports within the system.
To allow enterprises to achieve this level of standardization and auditability, SAP BW must persistently store large amounts of data within the different layers of its architecture. Managing the size of the objects within these layers will become increasingly important as the system grows to ensure high levels of performance for end-user queries and data delivery.
Partitioning and archiving are alternative methods of improving database and application performance. Depending on a database administrator’s comfort level with one technology or method over the other, either partitioning or archiving could be implemented to address performance issues due to data growth in production applications. But what are the best practices for using one method or the other, and how can they be used better together?