Wrangling Data Risk: Discovery, Detection, and Protection
RSA Week is that hallowed time of year when every security professional seeks out the latest technology and tricks of the trade that make them more efficient at their jobs. I would like to propose a simple yet holistic approach to a perennial problem: discover, detect, and protect.
Discover: Where is my Sensitive Data?
The first step is to define policies for classifying sensitive data. These policies may originate from compliance mandates such as PCI DSS or GDPR, or from the need to protect PII, but they can also encompass any exposed data a company would like to be made aware of.
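As a rough illustration, a classification policy can be expressed as a mapping from each sensitive data class to its compliance driver and a detection pattern. The class names, drivers, and regular expressions below are illustrative assumptions, not a standard format:

```python
import re

# Hypothetical classification policy: each sensitive data class maps to
# the compliance driver behind it and a simple detection pattern.
CLASSIFICATION_POLICIES = {
    "SSN": {
        "driver": "PII",
        "pattern": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    },
    "CREDIT_CARD": {
        "driver": "PCI DSS",
        "pattern": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    },
    "EMAIL": {
        "driver": "GDPR",
        "pattern": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    },
}

def classify(value: str) -> list[str]:
    """Return the sensitive data classes a value matches."""
    return [name for name, policy in CLASSIFICATION_POLICIES.items()
            if policy["pattern"].search(value)]

print(classify("123-45-6789"))  # ['SSN']
```

Real classifiers go far beyond regular expressions (checksums, dictionaries, machine learning), but even this shape makes the policy explicit and testable.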
Understand and Monitor Sensitive Data Risks
To get a comprehensive picture of the risks associated with sensitive data, you need to identify its location, volume, proliferation (where sensitive data is created and how it flows through the organization), and protection status (how the data is currently protected by data security controls).
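A minimal sketch of the discovery step, assuming a file-based data store and an SSN-shaped pattern as the only policy: walk the tree and record each location along with the volume of matches found. Production scanners also cover databases, SaaS applications, and cloud stores:

```python
import os
import re

# Illustrative pattern; a real scan would draw on the full policy set.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def discover(root: str) -> dict[str, int]:
    """Map each file containing sensitive values to the count found there."""
    findings: dict[str, int] = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    hits = len(SSN_RE.findall(f.read()))
            except OSError:
                continue  # unreadable file: skip rather than fail the scan
            if hits:
                findings[path] = hits  # location -> volume
    return findings
```

Tracking proliferation then amounts to re-running discovery over time and diffing the locations that appear.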
An organization’s security team may already understand its security risks, but a solution that monitors them can go a long way toward not only confirming what the team knows but also alerting it to any security blind spots it may have missed.
Having a quantitative measure of the chinks in the armor, along with an aggregated risk score delivered on a regular basis, can help the security team prioritize its valuable time: focus 80% of the effort on the 20% of data that is riskiest.
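One way to sketch such an aggregated score, under the assumption of three normalized risk factors and hand-picked weights (both are illustrative, not a standard model):

```python
# Hypothetical weighted risk model: each factor is normalized to [0, 1].
WEIGHTS = {"sensitivity": 0.5, "exposure": 0.3, "volume": 0.2}

def risk_score(store: dict) -> float:
    """Aggregate a store's risk as a weighted sum of its factors."""
    return sum(WEIGHTS[f] * store[f] for f in WEIGHTS)

def top_riskiest(stores: list[dict], fraction: float = 0.2) -> list[dict]:
    """Return the riskiest `fraction` of stores (the 80/20 cut)."""
    ranked = sorted(stores, key=risk_score, reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))
    return ranked[:cutoff]

stores = [
    {"name": "hr_db", "sensitivity": 0.9, "exposure": 0.7, "volume": 0.8},
    {"name": "logs",  "sensitivity": 0.2, "exposure": 0.9, "volume": 0.5},
    {"name": "crm",   "sensitivity": 0.8, "exposure": 0.3, "volume": 0.9},
]
print(top_riskiest(stores)[0]["name"])  # hr_db
```

The exact factors and weights matter less than having the ranking be explicit, repeatable, and open to review.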
Detecting Anomalous User Activities on Sensitive Data
Data does not breach itself; there is always a user actor involved. To understand the risk associated with users, a security team captures user activity and applies UBA (User Behavioral Analytics). UBA profiles and baselines each user’s activities against their own history and that of their peers to assess what constitutes normal versus unusual behavior. Abnormal behavior in the characteristics, volume, or combination of data or data stores an employee accesses is more likely to be detected by a data-focused UBA. The additional context supports greater detection accuracy, enabling more timely protection or remediation of potential threats to sensitive data.[1]
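The baselining idea can be sketched very simply, assuming we keep a history of daily record-access counts per user and flag a day that deviates from the user's own mean by more than a chosen number of standard deviations (the threshold here is an assumption; real UBA models many more signals, including peer-group comparison):

```python
import statistics

def is_anomalous(history: list[int], today: int, threshold: float = 3.0) -> bool:
    """Flag today's access count if it sits far outside the user's baseline."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history) or 1.0  # guard against zero deviation
    return abs(today - mean) / stdev > threshold

history = [40, 55, 48, 52, 45, 50, 47]   # typical daily record reads
print(is_anomalous(history, 51))         # False: a normal day
print(is_anomalous(history, 5000))       # True: an unusual spike
```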
Sensitive Data Protection: at the Right Level, with the Right Tools
With the influx of protection tools on the market, companies have begun putting the cart before the horse – adopting a protection tool without first assessing or identifying what to protect. Gartner recommends balancing business needs against risk, threat, and compliance[2] as a logical starting point for security risk mitigation, rather than blind implementation of such tools.
Once we have a list of data stores prioritized by the risk factors the organization has defined, the right level of protection can be applied to the right dataset.
Based on the type of dataset and the users involved, the organization can define an appropriate protection technique. For instance, you can persistently mask all sensitive data in the non-production environment or tokenize all PCI data in the production system.
Data protection requires the right level of control and method of protection (including user authentication, access control, encryption, tokenization, masking, etc.) for all the different data types and silos across the organization. The more challenging aspect is applying the right level of protection not only to data at rest and in transit but also across environments and usage types (e.g., Dev, Test, Prod, Demo) and across different users.
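To make the masking and tokenization examples concrete, here is a minimal sketch. The masking rule (keep only the last four SSN digits) is a common convention but an assumption here, and the keyed-hash token is only an illustration – real tokenization is done by a token vault or a format-preserving encryption library:

```python
import hashlib
import hmac
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def mask_ssn(text: str) -> str:
    """Persistent masking for non-production: keep only the last four digits."""
    return SSN_RE.sub(lambda m: "XXX-XX-" + m.group()[-4:], text)

def tokenize(value: str, key: bytes) -> str:
    """Deterministic stand-in token; real systems use a vault or FPE."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]

print(mask_ssn("SSN on file: 123-45-6789"))  # SSN on file: XXX-XX-6789
```

The key difference: masking is irreversible and suits non-production copies, while tokenization preserves a consistent, reversible (via the vault) reference for production use.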
Automating Protection After Discovery and Detection
The security team would be more effective if it could define the protection technique and the specific tool to be used for each affected area. Such protection could be automated or may require manual intervention by the data owner or application owner.
In some cases, a fully integrated approach to detection and protection may be preferred. For instance, if a user is accessing more SSN records than they usually do, it might be best to automatically apply dynamic masking for that user or move them to a high-risk user group in LDAP.
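That integrated hand-off can be sketched as a small policy function. The action strings stand in for calls to a real dynamic-masking tool and LDAP directory, and the 3x-baseline threshold is a hypothetical policy choice:

```python
def respond(user: str, ssn_reads_today: int, baseline: int = 50) -> list[str]:
    """Return the protection actions to trigger for this user's activity."""
    if ssn_reads_today <= 3 * baseline:  # hypothetical 3x-baseline threshold
        return []  # within normal bounds: no action
    # In practice these would call the masking tool's API and the LDAP server.
    return [
        f"apply_dynamic_masking(user={user})",
        f"move_to_ldap_group(user={user}, group='high-risk')",
    ]

print(respond("alice", 20))   # []
print(respond("bob", 400))    # masking plus group move
```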
If you are looking to mask data in a non-production system, you may need to work with the application owner to ensure the data is masked consistently without breaking application integrity. The timing of the masking run also needs to account for the specific business operations running on the target application, so collaboration between the security team and the application team is vital. The right solution must foster a collaborative environment to orchestrate the required business flow.
Best-of-breed protection techniques are available for different applications, such as Shield for Salesforce or Ranger for the Hortonworks Hadoop system. You may want to orchestrate with the right protection tools rather than rely on manual hand-offs between the security team and the application owners, which can take weeks or months to complete and close.
After protection is in place using the method and tool of choice, it is important for the system to mark the data store as protected and adjust its associated risk. The integration between the protection tool and the detection system needs to take into account the data and the degree of protection applied.
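A sketch of that closing-the-loop step, assuming the risk model from earlier and illustrative per-method risk-reduction factors (the factors are assumptions, not published figures):

```python
# Illustrative residual-risk reductions per protection method.
RISK_REDUCTION = {"encryption": 0.7, "tokenization": 0.8, "masking": 0.6}

def mark_protected(store: dict, method: str) -> dict:
    """Record the protection status and lower the store's residual risk."""
    store["protected_by"] = method
    store["risk"] = round(store["risk"] * (1 - RISK_REDUCTION[method]), 2)
    return store

store = {"name": "hr_db", "risk": 0.82}
print(mark_protected(store, "tokenization"))  # risk drops from 0.82 to 0.16
```

Feeding this adjusted score back into the detection system keeps the risk dashboard honest: protected stores sink in the priority list, and the team's attention moves to what is still exposed.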
An automated, integrated approach offers several benefits:
- Low time investment and operational costs – saves much-needed staff hours by automating protection fed by the detection of risks associated with users and/or data stores
- Policy compliance – applies protection consistently across different datasets and ensures it dovetails with the established policy guidelines
- Collaboration and business buy-in – allows flexibility in the timing and methods of protection, which in turn ensures collaboration and buy-in from data owners and application owners
1. Informatica Blog: Claudia Chandra, "The Missing Link for User Behavioral Analytics"
2. Gartner, "Shift Cybersecurity Investment to Detection and Response"