The Stray Sheep or The Power of Understanding Your Outliers

Comodo is a leading SSL provider that also offers free antivirus, internet security, firewall, endpoint security, and other security tools. In March 2011, this company that specializes in security suffered a critical security breach, which damaged its reputation and cost it a significant amount to mitigate the exposure. A hacker used a valid username and password to get into Comodo’s certificate-issuing systems, set up a new user account, and issued nine validated certificate requests. Comodo then issued the certificates to the interloper for leading sites, exposing not only Comodo but also its customers and its customers’ users to a serious risk[i].

The hacker hid behind a legitimate account, so the systems authenticated them as a legitimate user. Under this umbrella of legitimacy, they did real damage.

This can happen to any company: actions taken under the cover of a legitimate account can be as vicious and dangerous as an external threat. A disgruntled employee, a bribed employee, stolen valid usernames and passwords, or even unintentional mistakes made by a valid user can all wreak havoc in the systems.

How can a company protect itself from illegitimate actions performed by legitimate accounts? By knowing what a legitimate behavior pattern looks like, and therefore recognizing and flagging any deviation from that pattern.

A legitimate account may demonstrate behavior that is outside its patterned norm: logging in from a different geography than the pattern suggests, taking a different action than usual (such as a download instead of an upload), or simply acting at an odd time of day. For example, an employee usually accesses the system at 8:30 in the morning, but the action happened at 9 pm. Any deviation from a patterned behavior, any outlier, is an indication of risk.
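To make the idea concrete, here is a minimal sketch of checking one event against an account's history. Everything in it is an assumption for illustration: the `AccountProfile` fields, the `deviations` helper, the sample data, and the z-score threshold of 3 are hypothetical and do not describe how any particular product works.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class AccountProfile:
    """Hypothetical summary of an account's past behavior."""
    login_hours: list   # past access times as hour of day, e.g. 8.5 = 8:30 am
    countries: set      # countries the account has logged in from
    actions: set        # actions the account normally performs


def deviations(profile, hour, country, action, z_threshold=3.0):
    """Return the reasons, if any, why an event falls outside the profile."""
    reasons = []
    mu = mean(profile.login_hours)
    sigma = stdev(profile.login_hours)
    # Time of day: flag events far from the usual hour (simple z-score).
    if sigma > 0 and abs(hour - mu) / sigma > z_threshold:
        reasons.append("unusual time of day")
    # Geography and action: flag anything never seen for this account.
    if country not in profile.countries:
        reasons.append("unusual geography")
    if action not in profile.actions:
        reasons.append("unusual action")
    return reasons


profile = AccountProfile(
    login_hours=[8.5, 8.4, 8.6, 8.7, 8.3, 8.5, 8.6],
    countries={"US"},
    actions={"upload"},
)

print(deviations(profile, hour=8.55, country="US", action="upload"))
# → []
print(deviations(profile, hour=21.0, country="US", action="download"))
# → ['unusual time of day', 'unusual action']
```

A plain z-score ignores the fact that hours wrap around midnight, so an account that is normally active late at night would need a circular measure instead.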

Data Outliers

One of my previous blog posts discussed how patterns used in Artificial Intelligence can drive productivity gains. As the example above illustrates, however, what falls outside the pattern is equally important, and a variety of Machine Learning algorithms are designed to detect exactly that.

Our CLAIRE™ engine embedded inside Secure@Source uses statistical and machine learning approaches to detect data outliers and anomalies. The User Behavior Analytics (UBA) capability detects patterns of user behavior that might be risky and expose an organization to data misuse. UBA is capable of detecting impersonation, credential hijacking, and privilege escalation attacks. UBA applies unsupervised machine learning to a multi-dimensional model of user activities, which includes the number of data stores accessed by the user, the number of requests made, and the number of affected records across different systems. Principal component analysis is applied to this model for dimensionality reduction. The BIRCH technique is applied for unsupervised hierarchical clustering to find users whose behavior was different during a given period. To validate the anomalous behavior, distance and density-based outlier detection methods are employed, and the statistical Grubbs’ test for outliers is performed to confirm that objects indicated by the first two methods are indeed outliers in the cluster system.
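The steps described above can be chained together with open-source tools. The following is a rough sketch only, not the CLAIRE implementation: it assumes scikit-learn and SciPy, uses made-up synthetic activity data, and every parameter (two principal components, two BIRCH clusters, 20 neighbors for Local Outlier Factor, a 0.05 significance level) is an arbitrary choice for illustration. Local Outlier Factor stands in for the density-based method, and distance to the main cluster's centroid stands in for the distance-based one.

```python
import numpy as np
from scipy import stats
from sklearn.cluster import Birch
from sklearn.decomposition import PCA
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler


def grubbs_test(values, alpha=0.05):
    """Two-sided Grubbs' test on the single most extreme value.

    Returns (index of that value, True if it is a significant outlier).
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    deviations = np.abs(values - values.mean())
    idx = int(np.argmax(deviations))
    g = deviations[idx] / values.std(ddof=1)
    # Critical value derived from the t-distribution with n - 2 degrees of freedom.
    t_crit = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t_crit**2 / (n - 2 + t_crit**2))
    return idx, bool(g > g_crit)


# Synthetic activity for one period, one row per user (illustrative data):
# [data stores accessed, requests made, records affected]
rng = np.random.default_rng(0)
typical = rng.normal(loc=[5, 200, 1000], scale=[1, 20, 100], size=(60, 3))
hijacked = np.array([[25.0, 900.0, 50000.0]])  # made-up anomalous user
activity = np.vstack([typical, hijacked])      # anomalous user is row 60

# 1. Standardize the features, then reduce dimensionality with PCA.
scaled = StandardScaler().fit_transform(activity)
reduced = PCA(n_components=2).fit_transform(scaled)

# 2. Cluster with BIRCH; users outside the largest cluster are candidates.
labels = Birch(n_clusters=2).fit_predict(reduced)
main = np.bincount(labels).argmax()
candidates = set(np.flatnonzero(labels != main).tolist())

# 3. Distance-based score (distance to the main cluster's centroid) and
#    density-based flag (Local Outlier Factor marks outliers with -1).
centroid = reduced[labels == main].mean(axis=0)
distances = np.linalg.norm(reduced - centroid, axis=1)
lof_flags = LocalOutlierFactor(n_neighbors=20).fit_predict(reduced) == -1

# 4. Grubbs' test on the distance scores confirms the most extreme user.
idx, significant = grubbs_test(distances)
confirmed = [idx] if significant and idx in candidates and lof_flags[idx] else []
print("confirmed outliers (row indices):", confirmed)
```

Grubbs' test assumes roughly normal data and checks only one extreme value per pass, so a production system would need to repeat it or use a generalized variant when several users misbehave in the same period.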

Learn more about Secure@Source here.

[i] Comodo Group Certificate hacking